FileLoader Node Documentation
Overview
The FileLoader node enables you to automatically load and process various types of files in your TheoBuilder workflows. Whether you need to import customer data from CSV files, process video content for AI analysis, or load documents from directories, this node handles file operations seamlessly without requiring any technical expertise.
What This Node Does
The FileLoader node reads files from your specified locations and makes their content available to other nodes in your workflow. It supports multiple file formats including CSV spreadsheets, video files, audio files, images, and entire directories of documents. The node automatically processes the files and converts them into formats that other workflow nodes can use.
Configuration Parameters
File Type Selection
Field Name: filetype
Type: Dropdown menu with options:
- CSV File: Load spreadsheet data with rows and columns for customer lists, inventory, or any tabular data
- Video File: Extract frames and audio from video files for AI analysis or content processing
- Audio File: Load audio files for transcription or audio analysis
- Image File: Load image files for visual AI processing or content management
- Directory Source: Load multiple files from a folder, useful for batch processing documents
- JSON File: Load structured data files
- Text File: Load plain text documents
- Excel File: Load Microsoft Excel spreadsheets
- PDF File: Load PDF documents for text extraction or analysis
Default Value: None selected
Simple Description: Choose the type of file you want to load into your workflow
When to Change This: Select based on your source file format - CSV for spreadsheets, Video for multimedia content, Directory for multiple files
Business Impact: Choosing the correct file type ensures proper data processing and prevents workflow errors
File Path Configuration
Field Name: filename
Type: Smart text field with file browser
Default Value: Empty
Simple Description: Specify the exact location of your file
When to Change This: Enter the path to your specific file when loading individual files (not used for directory sources)
Business Impact: Accurate file paths ensure your workflow can reliably access your data every time it runs
Key Column Name (CSV Files Only)
Field Name: keyColumnName
Type: Text field
Default Value: "id"
Simple Description: The column name that contains unique identifiers for each row
When to Change This: If your CSV uses a different column name for unique IDs (like "customer_id", "order_number", or "employee_id")
Business Impact: Proper key column identification enables accurate data matching and prevents duplicate processing
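To illustrate what a key column is used for, the following minimal Python sketch indexes CSV rows by a key column and reports repeated keys. It is not the node's actual implementation; the function name `index_rows_by_key`, the sample data, and the keep-the-first-row policy are assumptions made for the example.

```python
import csv
import io


def index_rows_by_key(csv_text, key_column="id"):
    """Return (rows_by_key, duplicate_keys) for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or key_column not in reader.fieldnames:
        raise KeyError(f"key column {key_column!r} not found in CSV header")
    rows_by_key = {}
    duplicates = []
    for row in reader:
        key = row[key_column]
        if key in rows_by_key:
            # Assumed policy: keep the first row and report the repeat.
            duplicates.append(key)
            continue
        rows_by_key[key] = row
    return rows_by_key, duplicates


# Hypothetical sample data with one repeated customer_id.
sample = "customer_id,name\nC1,Ada\nC2,Grace\nC1,Ada (duplicate)\n"
rows, dupes = index_rows_by_key(sample, key_column="customer_id")
print(sorted(rows), dupes)
```

If the key column name does not match the CSV header exactly, the sketch raises an error instead of silently producing unmatched rows, which is the kind of problem a correct keyColumnName setting avoids.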
Directory Source Configuration
Field Name: directorySource
Type: Smart text field with folder browser
Default Value: Empty
Simple Description: The folder path containing multiple files to process
When to Change This: When loading multiple files from a specific folder location
Business Impact: Enables batch processing of multiple documents, saving time on manual file handling
Field Name: allowedFileExtensions
Type: Text field
Default Value: Empty
Simple Description: Comma-separated list of file types to include (e.g., ".pdf, .docx, .xlsx")
When to Change This: Specify which file types to process when loading from directories
Business Impact: Filters out unwanted files and ensures only relevant documents are processed
Field Name: includeSubDirs
Type: Checkbox
Default Value: Unchecked
Simple Description: Include files from folders within the main directory
When to Change This: Check this when you want to process files from subdirectories as well
Business Impact: Enables comprehensive document processing across entire folder structures
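The combined effect of allowedFileExtensions and includeSubDirs can be sketched in Python as follows. This is an illustration under stated assumptions, not the node's own code: the helper names are hypothetical, matching is assumed to be case-insensitive, and an empty extension list is assumed to mean "no filter".

```python
from pathlib import Path


def parse_extensions(allowed):
    """Turn '.pdf, .docx' into {'.pdf', '.docx'}; empty input yields an empty set."""
    parts = (p.strip().lower() for p in allowed.split(","))
    return {p if p.startswith(".") else "." + p for p in parts if p}


def list_files(directory, allowed="", include_sub_dirs=False):
    """List files under directory, optionally recursing and filtering by extension."""
    extensions = parse_extensions(allowed)
    root = Path(directory)
    # rglob walks subdirectories; glob stays in the top-level folder.
    candidates = root.rglob("*") if include_sub_dirs else root.glob("*")
    return sorted(
        p for p in candidates
        if p.is_file() and (not extensions or p.suffix.lower() in extensions)
    )
```

For example, `list_files("/path/to/contracts", ".pdf, .docx", include_sub_dirs=True)` would return every PDF and Word file in the folder tree, where the path shown is a placeholder.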
Field Name: convertToBase64
Type: Checkbox
Default Value: Unchecked
Simple Description: Convert files to base64 format for AI processing
When to Change This: Enable when sending files to AI models that require base64 encoding
Business Impact: Ensures compatibility with AI services that need files in encoded format
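For readers unfamiliar with base64, the short Python sketch below shows what the conversion does to a file's bytes. The helper name `to_base64` is hypothetical, and the exact output structure the node produces is not specified here.

```python
import base64


def to_base64(data: bytes) -> str:
    """Encode raw file bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


encoded = to_base64(b"hello")
print(encoded)  # aGVsbG8=
```

Base64 output is roughly one third larger than the original bytes, so enabling this option increases the amount of data passed to downstream nodes.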
Video Processing Configuration
Field Name: secondsPerFrame
Type: Number input with spin buttons
Default Value: 4
Valid Range: 1 to 60 seconds
Simple Description: How often to extract frames from video (every X seconds)
When to Change This: Lower numbers (1-2) for detailed analysis, higher numbers (5-10) for overview analysis
Business Impact: Balances processing speed with analysis detail - more frames provide better analysis but take longer to process
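The trade-off can be estimated with simple arithmetic. The Python sketch below computes the timestamps at which frames would be sampled, assuming the first frame is taken at 0 seconds and one frame is taken every secondsPerFrame seconds after that; the node's exact sampling behavior may differ, and the function name is hypothetical.

```python
import math


def frame_timestamps(duration_seconds, seconds_per_frame=4):
    """Return the sample times (in seconds) for a video of the given length."""
    if not 1 <= seconds_per_frame <= 60:
        raise ValueError("seconds_per_frame must be between 1 and 60")
    # Sample at 0, n, 2n, ... strictly before the end of the video.
    return list(range(0, math.ceil(duration_seconds), seconds_per_frame))


print(len(frame_timestamps(60)))     # 15 frames at the default of 4
print(len(frame_timestamps(60, 2)))  # 30 frames at 2 seconds per frame
```

Under these assumptions, halving the interval doubles the number of frames that downstream AI nodes must process.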
Field Name: videoImgFramesPropName
Type: Text field
Default Value: "video_frames"
Simple Description: The name used to reference extracted video frames in your workflow
When to Change This: Customize the property name to match your workflow's naming conventions
Business Impact: Consistent naming helps organize data flow and makes workflows easier to understand
Field Name: audioFilePathPropName
Type: Text field
Default Value: "audio_path"
Simple Description: The name used to reference the audio file path in your workflow
When to Change This: Customize the property name for better workflow organization
Business Impact: Clear naming conventions improve workflow maintainability and team collaboration
CSV Column Configuration
Field Name: columnsConfig
Type: Data grid with editable rows
Default Value: Empty list
Simple Description: Define how each CSV column should be processed
When to Change This: Configure when you need specific data types or column handling for your CSV data
Business Impact: Proper column configuration ensures accurate data processing and prevents type conversion errors
Column Configuration Options:
- Column Name: The exact name of the column in your CSV file
- Data Type: Choose from text, number, date, boolean, or other data types
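As an illustration of what column typing accomplishes, the Python sketch below converts the raw text values of one CSV row according to a list of column configurations. It is a sketch only: the `name` and `type` keys, the converter table, the ISO date format, and the accepted boolean spellings are all assumptions, not the node's documented behavior.

```python
from datetime import date

# Hypothetical converters for the data types named above.
CONVERTERS = {
    "text": str,
    "number": float,
    "date": date.fromisoformat,  # assumes YYYY-MM-DD values
    "boolean": lambda v: v.strip().lower() in {"true", "1", "yes"},
}


def apply_column_types(row, columns_config):
    """Return a copy of row with each configured column converted to its type."""
    typed = dict(row)
    for column in columns_config:
        name, data_type = column["name"], column["type"]
        typed[name] = CONVERTERS[data_type](row[name])
    return typed


raw_row = {"sku": "A-1", "quantity": "12", "last_updated": "2024-01-31"}
config = [
    {"name": "quantity", "type": "number"},
    {"name": "last_updated", "type": "date"},
]
print(apply_column_types(raw_row, config))
```

Columns without a configuration entry (here, `sku`) stay as text in this sketch. A value that cannot be converted, such as "twelve" in a number column, raises an error, which is the type conversion problem that accurate column configuration is meant to prevent.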
Cache Settings
Field Name: disableCache
Type: Toggle switch (On/Off)
Default Value: Off (cache enabled)
Simple Description: Controls whether the node saves processed file data for faster subsequent runs
- Off (Cache Enabled): Saves processed data for faster repeat processing
- On (Cache Disabled): Always reprocesses files from scratch
When to Change This: Disable cache when working with frequently changing files or when you need the most current data
Business Impact: Cache improves performance and reduces costs, but may show outdated data if files change frequently
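The staleness risk can be seen in a minimal Python sketch of a path-keyed cache. This is not how the node is known to implement caching; the class name `CachedLoader` and the choice of the file path as cache key are assumptions made to show the general behavior.

```python
class CachedLoader:
    """Wrap a load function with an optional in-memory cache keyed by path."""

    def __init__(self, load_fn, disable_cache=False):
        self._load_fn = load_fn
        self._disable_cache = disable_cache
        self._cache = {}

    def load(self, path):
        if self._disable_cache:
            # Always reprocess from the source.
            return self._load_fn(path)
        if path not in self._cache:
            self._cache[path] = self._load_fn(path)
        return self._cache[path]


# A dict stands in for the file system so the example needs no real files.
files = {"inventory.csv": "version 1"}
cached = CachedLoader(files.__getitem__)
fresh = CachedLoader(files.__getitem__, disable_cache=True)

cached.load("inventory.csv")
files["inventory.csv"] = "version 2"  # the source file changes
print(cached.load("inventory.csv"))   # version 1 (stale)
print(fresh.load("inventory.csv"))    # version 2 (current)
```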
Real-World Use Cases
Customer Data Import for Marketing Campaigns
Business Situation: A marketing team needs to import customer lists from Excel files to create targeted email campaigns.
What You'll Configure:
- Select "Excel File" from the file type dropdown
- Enter the path to your customer spreadsheet in the file path field
- Set "customer_id" as the key column name, if the key column setting is available for Excel files (it is documented above for CSV files)
- Configure column types (text for names, email for addresses, number for purchase amounts)
- Keep cache enabled for faster processing
What Happens: The node loads your customer data and makes it available for email personalization, segmentation, and campaign targeting in subsequent workflow nodes.
Business Value: Automates customer data import, reducing manual data entry time by 85% and eliminating human errors in customer information.
Video Content Analysis for Social Media
Business Situation: A content marketing agency wants to automatically analyze video content to generate thumbnails and extract key moments for social media posts.
What You'll Configure:
- Choose "Video File" from the file type dropdown
- Enter the path to your video file
- Set seconds per frame to 2 for detailed analysis
- Use default property names for video frames and audio
- Keep cache enabled (the default) for faster reprocessing
What Happens: The node extracts frames every 2 seconds and separates the audio track, making both available for AI analysis to identify key moments and generate thumbnails.
Business Value: Reduces video processing time from hours to minutes and enables automatic generation of social media content variants.
Document Processing for Legal Compliance
Business Situation: A legal department needs to process hundreds of contracts stored in various folders to extract key terms and dates.
What You'll Configure:
- Select "Directory Source" from the file type dropdown
- Enter the main contracts folder path in directory source
- Set allowed extensions to ".pdf, .docx"
- Check "Include Sub Directories" to process all contract folders
- Enable "Convert to base64" for AI document analysis
- Disable cache to ensure latest document versions are processed
What Happens: The node loads all contract documents from the specified folders and converts them to a format suitable for AI analysis of terms, dates, and compliance requirements.
Business Value: Processes 500+ documents in minutes instead of weeks, ensuring 100% compliance review coverage and reducing legal review costs by 70%.
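Expressed with the field names from the Configuration Parameters section, the settings for this use case might look like the Python dictionary below. The dictionary form, the stored value for `filetype`, and the folder path are assumptions for illustration; the node itself is configured through its form.

```python
# Hypothetical representation of the contract-processing configuration.
contract_loader_config = {
    "filetype": "Directory Source",            # assumed to match the dropdown label
    "directorySource": "/path/to/contracts",   # placeholder folder path
    "allowedFileExtensions": ".pdf, .docx",
    "includeSubDirs": True,
    "convertToBase64": True,
    "disableCache": True,                      # always reprocess the latest versions
}
```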
Inventory Data Synchronization
Business Situation: A retail company needs to automatically import daily inventory updates from CSV files generated by their warehouse management system.
What You'll Configure:
- Choose "CSV File" from the file type dropdown
- Enter the daily inventory file path
- Set "sku" as the key column name
- Configure columns: SKU (text), Quantity (number), Price (number), Last_Updated (date)
- Disable cache to always get the latest inventory data
What Happens: The node imports current inventory levels and makes the data available for automated reorder alerts, price updates, and stock level notifications.
Business Value: Eliminates manual inventory data entry, reduces stock-outs by 45%, and ensures real-time inventory accuracy across all sales channels.
Step-by-Step Configuration Guide
Setting Up Basic File Loading
- Add the Node:
  - Drag the FileLoader node from the left panel onto your workflow canvas
  - Connect it to your trigger node or previous processing node
- Choose Your File Type:
  - Click on the FileLoader node to open the configuration panel
  - Select your file type from the dropdown menu (CSV, Video, Directory, etc.)
- Specify File Location:
  - For individual files: Enter the complete file path in the "File Path" field
  - For directories: Enter the folder path in the "Directory Source" field
- Configure File-Specific Settings:
  - For CSV files: Enter the key column name and configure column data types
  - For video files: Set the frame extraction rate and property names
  - For directories: Specify allowed file extensions and subdirectory inclusion
Advanced Configuration Options
- Access Advanced Settings:
  - Scroll down to see the collapsible sections (CSV Columns Config, Video Config, Cache)
  - Click on any section title to expand its options
- Configure CSV Column Processing:
  - Click "CSV Columns Config" to expand
  - Use the data grid to add, edit, or remove column configurations
  - For each column, specify the name and data type
- Set Up Video Processing:
  - Click "Video Config" to expand
  - Adjust the seconds per frame based on your analysis needs
  - Customize property names if needed for your workflow
- Manage Cache Settings:
  - Click "Cache" to expand
  - Toggle cache on/off based on your data freshness requirements
Testing Your Configuration
- Validate Settings:
  - Review all configured fields for accuracy
  - Ensure file paths are correct and accessible
  - Verify column names match your actual file structure
- Test the Node:
  - Save your configuration
  - Run a test execution to verify file loading works correctly
  - Check the output data format matches your expectations
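The validation checks above can be sketched as a small Python function that inspects a configuration before a test run. This is a hypothetical helper, not part of the product: it assumes the configuration is available as a dictionary keyed by the documented field names and that `filetype` holds the dropdown label.

```python
def validate_config(config):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    filetype = config.get("filetype")
    if not filetype:
        problems.append("filetype is not selected")
    if filetype == "Directory Source":
        # Directory sources use directorySource instead of filename.
        if not config.get("directorySource"):
            problems.append("directorySource is empty")
    elif filetype and not config.get("filename"):
        problems.append("filename is empty")
    if filetype == "Video File":
        seconds = config.get("secondsPerFrame", 4)
        if not 1 <= seconds <= 60:
            problems.append("secondsPerFrame must be between 1 and 60")
    if filetype == "CSV File" and not config.get("keyColumnName", "id"):
        problems.append("keyColumnName is empty")
    return problems


print(validate_config({"filetype": "Video File", "filename": "demo.mp4", "secondsPerFrame": 0}))
```

A check like this only catches missing or out-of-range settings. Whether the file actually exists and loads correctly still has to be confirmed with a test execution.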
Industry Applications
Healthcare Organizations
Common Challenge: Medical practices need to import patient appointment data from various scheduling systems while maintaining HIPAA compliance.
How This Node Helps: Securely loads patient data files with proper column typing for dates, times, and patient identifiers, ensuring data integrity for automated appointment reminders and follow-up workflows.
Configuration Recommendations:
- Use "CSV File" type for appointment exports
- Set "patient_id" as the key column
- Configure date columns properly for appointment times
- Enable cache for faster processing of large patient lists
- Use directory source for batch processing multiple department schedules
Results: Healthcare providers reduce appointment scheduling errors by 60% and improve patient communication efficiency by 40%.
Financial Services
Common Challenge: Banks and credit unions need to process daily transaction files and customer account updates from multiple sources while ensuring data accuracy.
How This Node Helps: Automatically imports transaction data with proper numeric formatting for amounts, dates for timestamps, and text handling for account numbers and descriptions.
Configuration Recommendations:
- Choose "CSV File" for transaction exports
- Set "transaction_id" as the key column
- Configure amount columns as numbers with decimal precision
- Use date data types for transaction timestamps
- Disable cache for real-time transaction processing
Results: Financial institutions achieve 99.9% data accuracy in automated transaction processing and reduce manual data entry costs by 80%.
E-commerce Platforms
Common Challenge: Online retailers need to process product catalogs, inventory updates, and customer order data from multiple suppliers and marketplaces.
How This Node Helps: Handles various file formats from different suppliers, standardizes product data, and enables automated inventory management across multiple sales channels.
Configuration Recommendations:
- Use "Directory Source" for batch supplier file processing
- Set allowed extensions to ".csv, .xlsx, .xml"
- Include subdirectories for organized supplier folders
- Configure product_sku as the key column
- Enable base64 conversion for product image processing
Results: E-commerce businesses reduce product catalog management time by 75% and improve inventory accuracy across all sales channels by 90%.
Manufacturing Companies
Common Challenge: Manufacturers need to import production data, quality control reports, and supply chain information from various systems and sensors.
How This Node Helps: Processes production files with proper data typing for measurements, timestamps, and quality metrics, enabling automated quality control and production optimization.
Configuration Recommendations:
- Select appropriate file types based on source systems
- Configure numeric columns for measurements and quantities
- Set up date/time columns for production timestamps
- Use directory processing for batch quality reports
- Enable cache for historical production analysis
Results: Manufacturing companies improve production efficiency by 35% and reduce quality control processing time by 65%.
Best Practices
File Organization
- Use consistent naming conventions for your files
- Organize files in logical folder structures
- Keep file paths as short as possible while remaining descriptive
- Use date stamps in file names for version control
Performance Optimization
- Enable cache for files that don't change frequently
- Choose a seconds-per-frame value that suits the analysis; lower values extract more frames and take longer to process (don't over-extract)
- Filter file extensions in directory processing to avoid unnecessary files
- Consider file sizes when processing large datasets
Data Quality
- Always specify key columns for CSV files to ensure proper data relationships
- Configure column data types accurately to prevent processing errors
- Test with sample files before processing large datasets
- Validate file formats match your configuration settings
Security Considerations
- Ensure file paths are accessible to your workflow
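- Use base64 conversion when external AI services require it, keeping in mind that base64 is an encoding for compatibility, not encryption, and does not protect sensitive data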
- Consider data sensitivity when enabling or disabling cache
- Regularly review file access permissions
The FileLoader node transforms complex file handling into simple form-based configuration, enabling business users to create sophisticated data processing workflows without technical expertise. Whether you're importing customer data, processing multimedia content, or managing document workflows, this node provides the foundation for reliable, automated file operations in your business processes.