DataFrame Operations = default
What and how it can be used:
The DataFrame Operations component performs various operations on a DataFrame. It provides powerful data manipulation capabilities similar to pandas DataFrames in Python or data frames in R, enabling operations like filtering, sorting, grouping, aggregating, joining, pivoting, and statistical analysis on tabular data structures.

When/how the component should be used:
- Use when you need to perform complex operations on tabular data.
- Best for working with structured data in rows and columns.
- Perfect for aggregations, joins, and advanced data transformations.
- Create a new flow or use an existing flow.
- Add a DataFrame Operations component to the flow, and then connect DataFrame output from another component to the DataFrame input.
- In the Operation field, select the operation you want to perform on the incoming DataFrame. For example, the Filter operation keeps only the rows that match a condition on a specified column and value.
- Configure the operation’s parameters. The specific parameters depend on the selected operation. For example, if you select the Filter operation, you must define a filter condition using the Column Name, Filter Value, and Filter Operator parameters.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
Default settings:
- DataFrame
- Operation
Control Section:
- DataFrame
- Operation
Desired Behaviour:
- Preserves column structure
- Applies operations reliably
DataFrame Operations = Add Column
What and how it can be used:
The Add Column operation adds a new column to the DataFrame with a constant value. It creates a new column in the tabular data structure and populates all rows with the same specified value. This component enables adding fixed values, default settings, or static metadata to every row in a dataset.
When/how the component should be used:
- Use when you need to add a column with the same value for all rows
- Use to create a new column.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- New Column Name
- New Column Value
Default settings:
- DataFrame
- Operation
- New Column Name
- New Column Value
Control Section:
- DataFrame
- Operation
- New Column Name
- New Column Value
Default values:
- Operation : Add Column
Desired Behaviour:
- New column added without affecting others
- Preserve row count
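Example (illustrative sketch):
The behaviour described above can be approximated in pandas. This is a sketch of the equivalent operation, not the component's actual implementation; the helper name add_column and the sample data are hypothetical.

```python
import pandas as pd


def add_column(df: pd.DataFrame, new_column_name: str, new_column_value) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with one constant-valued column added."""
    # assign() returns a new DataFrame and broadcasts the scalar to every row,
    # so the input frame and its existing columns are left untouched.
    return df.assign(**{new_column_name: new_column_value})


orders = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 25.5, 7.25]})
tagged = add_column(orders, "source", "web")
print(tagged)
```

Note that in this sketch an existing column with the same name would be overwritten; whether the component does the same is not stated here.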
DataFrame Operations = Drop Column
What and how it can be used:
The Drop Column operation removes a column from the DataFrame, specified by Column Name. It deletes that column from the tabular data structure, creating a new DataFrame without it. This component enables removing unnecessary, redundant, or sensitive fields from datasets.
When/how the component should be used:
- Use when you need to remove unwanted columns from a DataFrame
- Use to remove columns that are not needed downstream.
- Use to reduce data size and complexity.
- Use to eliminate sensitive or irrelevant fields.
- Use before exporting, embedding, routing, or analysis.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Column Name
Default settings:
- DataFrame
- Operation
- Column Name
Control Section:
- DataFrame
- Operation
- Column Name
Default values:
- Operation : Drop Column
Desired Behaviour:
- Remove the specified column.
- Leave remaining data unchanged.
- Show an error if the column doesn’t exist.
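Example (illustrative sketch):
A pandas approximation of the three behaviours listed above, including the error on a missing column. The helper name drop_column, the error message, and the sample data are assumptions made for illustration.

```python
import pandas as pd


def drop_column(df: pd.DataFrame, column_name: str) -> pd.DataFrame:
    """Hypothetical helper: return df without column_name, failing if it is absent."""
    if column_name not in df.columns:
        # Explicit check so the error names the missing column.
        raise KeyError(f"Column '{column_name}' not found in DataFrame")
    # drop() returns a new DataFrame; the remaining columns and rows are unchanged.
    return df.drop(columns=[column_name])


users = pd.DataFrame(
    {
        "name": ["Ana", "Ben"],
        "email": ["ana@example.com", "ben@example.com"],
        "plan": ["pro", "free"],
    }
)
public = drop_column(users, "email")
print(public)
```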
DataFrame Operations = Filter
What and how it can be used:
The Filter operation filters the DataFrame based on a specified condition. The output is a DataFrame containing only the rows that matched the filter condition. This component enables selecting subsets of data by applying logical conditions on column values, returning only rows where the condition evaluates to true.
When/how the component should be used:
- Use to keep rows that meet specific conditions.
- Use before analysis, alerts, or reporting.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Column Name
- Filter Value
- Filter Operator
Default settings:
- DataFrame
- Operation
- Column Name
- Filter Value
- Filter Operator
Control Section:
- DataFrame
- Operation
- Column Name
- Filter Value
- Filter Operator
Default values:
- Operation : Filter
- Filter Operator : equals
Desired Behaviour:
- Applies condition row-by-row
- Outputs only matching rows
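Example (illustrative sketch):
The row-by-row condition can be sketched in pandas as a boolean mask. Only the "equals" operator is confirmed by this document; the other operator names in the table below, the helper name filter_rows, and the sample data are assumptions.

```python
import operator

import pandas as pd

# Operator names other than "equals" are hypothetical placeholders.
FILTER_OPERATORS = {
    "equals": operator.eq,
    "not equals": operator.ne,
    "greater than": operator.gt,
    "less than": operator.lt,
}


def filter_rows(df, column_name, filter_value, filter_operator="equals"):
    """Hypothetical helper: keep only rows where the condition is true."""
    compare = FILTER_OPERATORS[filter_operator]
    # Comparing a column with a scalar yields one True/False per row.
    mask = compare(df[column_name], filter_value)
    return df[mask]


tickets = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "status": ["open", "closed", "open", "closed"],
        "priority": [3, 1, 5, 2],
    }
)
open_tickets = filter_rows(tickets, "status", "open")
urgent = filter_rows(tickets, "priority", 3, "greater than")
print(open_tickets)
print(urgent)
```

In this sketch the filter value must already have a type comparable with the column; a value typed into a text field may need converting first.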
DataFrame Operations = Head
What and how it can be used:
The Head operation retrieves the first n rows of the DataFrame, where n is set by the Number of Rows parameter (default: 5). This component enables quick inspection of the beginning of a dataset, useful for previewing data structure, validating imports, or displaying sample records without processing the entire dataset.
When/how the component should be used:
- Use to inspect the first N rows of a table.
- Use for quick data validation and sanity checks.
- Use when you want a small sample without processing the full dataset.
- Use during development, debugging, or exploration.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Number of Rows
Default settings:
- DataFrame
- Operation
- Number of Rows
Control Section:
- DataFrame
- Operation
- Number of Rows
Default values:
- Operation : Head
- Number of Rows : 5
Desired Behaviour:
- Returns the first N rows based on current order
- Does not change column types or values
- If fewer than N rows exist, returns all rows
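Example (illustrative sketch):
A pandas approximation of the behaviour above, using the default of 5 stated for Number of Rows. The helper name head_rows and the sample data are hypothetical.

```python
import pandas as pd


def head_rows(df: pd.DataFrame, number_of_rows: int = 5) -> pd.DataFrame:
    """Hypothetical helper: return the first number_of_rows rows in current order."""
    # head() returns every row when the frame is shorter than the requested count.
    return df.head(number_of_rows)


readings = pd.DataFrame({"sensor": list("abcdefg"), "value": list(range(7))})
preview = head_rows(readings)

small = pd.DataFrame({"sensor": ["a", "b"], "value": [1, 2]})
small_preview = head_rows(small)

print(preview)
print(small_preview)
```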
DataFrame Operations = Rename Column
What and how it can be used:
The Rename Column operation renames an existing column in the DataFrame. It changes the name of the column given in Column Name to the value of New Column Name while preserving all data values and the order of columns. This component enables updating column names to follow naming conventions, improve clarity, or match expected schemas.
When/how the component should be used:
- Use to standardize column names in a schema.
- Use when integrating heterogeneous sources.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Column Name
- New Column Name
Default settings:
- DataFrame
- Operation
- Column Name
- New Column Name
Control Section:
- DataFrame
- Operation
- Column Name
- New Column Name
Default values:
- Operation : Rename Column
Desired Behaviour:
- Rename column exactly as specified.
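Example (illustrative sketch):
Renaming one column in pandas, as an approximation of this operation. The helper name rename_column and the sample data are hypothetical. Raising an error for a missing column is a choice made in this sketch; this section does not say how the component handles that case.

```python
import pandas as pd


def rename_column(df: pd.DataFrame, column_name: str, new_column_name: str) -> pd.DataFrame:
    """Hypothetical helper: rename one column, keeping values and column order."""
    # errors="raise" makes pandas raise KeyError for an unknown column name
    # instead of silently returning the frame unchanged.
    return df.rename(columns={column_name: new_column_name}, errors="raise")


raw = pd.DataFrame({"cust_nm": ["Ana", "Ben"], "amt": [12.5, 30.0]})
clean = rename_column(raw, "cust_nm", "customer_name")
print(clean)
```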
DataFrame Operations = Replace Value
What and how it can be used:
The Replace Value operation replaces a target value with a new value. All cells matching the target value are replaced with the new value. This component enables data cleaning, normalization, and transformation by finding and replacing specific values throughout the entire DataFrame or within specific columns.
When/how the component should be used:
- Use to replace specific values in one or more columns.
- Use to standardize inconsistent values across datasets.
- Use to clean data before analysis, routing, or storage.
- Use when missing, placeholder, or invalid values must be corrected.
- Use when deterministic data cleanup is required (no inference).
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Column Name
- Value to Replace
- Replacement Value
Default settings:
- DataFrame
- Operation
- Column Name
- Value to Replace
- Replacement Value
Control Section:
- DataFrame
- Operation
- Column Name
- Value to Replace
- Replacement Value
Default values:
- Operation : Replace Value
Desired Behaviour:
- Replace only exact matches.
- Leave all other values unchanged.
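Example (illustrative sketch):
Exact-match replacement within one column, sketched in pandas. The helper name replace_value and the sample data are hypothetical. The sketch works on the column named in Column Name, which is an assumption about how the component scopes the replacement.

```python
import pandas as pd


def replace_value(df, column_name, value_to_replace, replacement_value):
    """Hypothetical helper: replace exact matches in one column only."""
    result = df.copy()
    # Series.replace with scalar arguments matches whole cell values,
    # so "N/A pending" is not touched when replacing "N/A".
    result[column_name] = result[column_name].replace(value_to_replace, replacement_value)
    return result


accounts = pd.DataFrame(
    {
        "status": ["N/A", "active", "N/A pending"],
        "note": ["N/A", "ok", "ok"],
    }
)
cleaned = replace_value(accounts, "status", "N/A", "unknown")
print(cleaned)
```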
DataFrame Operations = Select Columns
What and how it can be used:
The Select Columns operation selects one or more specific columns from the DataFrame. It creates a new DataFrame containing only the specified columns, effectively projecting a subset of the data by removing unwanted columns while preserving all rows. This component enables focusing on relevant fields and simplifying data structures.
When/how the component should be used:
- Use when you need to reduce data to relevant fields.
- Use to simplify downstream processing.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Columns to Select
Default settings:
- DataFrame
- Operation
- Columns to Select
Control Section:
- DataFrame
- Operation
- Columns to Select
Default values:
- Operation : Select Columns
Desired Behaviour:
- Output only specified columns.
- Preserve row order.
- Fail clearly if a column is missing.
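Example (illustrative sketch):
A pandas approximation of column selection with a clear failure on missing columns. The helper name select_columns, the error message, and the sample data are hypothetical.

```python
import pandas as pd


def select_columns(df: pd.DataFrame, columns_to_select: list) -> pd.DataFrame:
    """Hypothetical helper: keep only the requested columns, in the requested order."""
    missing = [name for name in columns_to_select if name not in df.columns]
    if missing:
        # Report every missing column at once rather than only the first.
        raise KeyError(f"Columns not found in DataFrame: {missing}")
    # Selecting with a list of labels keeps all rows in their original order.
    return df[list(columns_to_select)]


products = pd.DataFrame(
    {
        "sku": ["A1", "B2", "C3"],
        "name": ["bolt", "nut", "washer"],
        "cost": [0.10, 0.05, 0.02],
        "supplier": ["acme", "acme", "zenith"],
    }
)
catalog = select_columns(products, ["sku", "name"])
print(catalog)
```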
DataFrame Operations = Sort
What and how it can be used:
The Sort operation sorts the DataFrame on a specific column in ascending or descending order. It arranges rows based on the values in that column, so data can be ordered by numeric value, alphabetically, or by date. The Sort Ascending setting controls the direction.
When/how the component should be used:
- Use when order matters.
- Use to order data by priority, time, or score.
- Use before presentation or batch processing.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Column Name
- Sort Ascending
Default settings:
- DataFrame
- Operation
- Column Name
- Sort Ascending
Control Section:
- DataFrame
- Operation
- Column Name
- Sort Ascending
Default values:
- Operation : Sort
- Sort Ascending : on
Desired Behaviour:
- Explicit ascending/descending order
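Example (illustrative sketch):
Sorting on one column with an explicit direction, sketched in pandas. The helper name sort_rows and the sample data are hypothetical; the default of ascending=True mirrors the Sort Ascending default listed above.

```python
import pandas as pd


def sort_rows(df: pd.DataFrame, column_name: str, sort_ascending: bool = True) -> pd.DataFrame:
    """Hypothetical helper: return df ordered by the values in one column."""
    # sort_values() returns a new DataFrame; the input order is not modified.
    return df.sort_values(by=column_name, ascending=sort_ascending)


leaderboard = pd.DataFrame({"player": ["kai", "lee", "mo"], "score": [72, 95, 88]})
lowest_first = sort_rows(leaderboard, "score")
highest_first = sort_rows(leaderboard, "score", sort_ascending=False)
print(lowest_first)
print(highest_first)
```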
DataFrame Operations = Tail
What and how it can be used:
The Tail operation retrieves the last n rows of the DataFrame, where n is set by the Number of Rows parameter (default: 5). This component enables quick inspection of the end of a dataset, useful for viewing recent entries, validating data imports, or displaying the most recent records without processing the entire dataset.
When/how the component should be used:
- Use to inspect the last N rows of a table.
- Use to verify recent or final entries in time-ordered data.
- Use for debugging batch or streaming outputs.
- Use when order matters (e.g. timestamps).
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Number of Rows
Default settings:
- DataFrame
- Operation
- Number of Rows
Control Section:
- DataFrame
- Operation
- Number of Rows
Default values:
- Operation : Tail
- Number of Rows : 5
Desired Behaviour:
- Return exactly the last N rows (if available).
- Preserve the existing row order.
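Example (illustrative sketch):
A pandas approximation of the behaviour above. The helper name tail_rows and the sample data are hypothetical; the default of 5 follows the description of Number of Rows.

```python
import pandas as pd


def tail_rows(df: pd.DataFrame, number_of_rows: int = 5) -> pd.DataFrame:
    """Hypothetical helper: return the last number_of_rows rows in existing order."""
    # tail() keeps the rows in their current order and returns the whole
    # frame when it has fewer rows than requested.
    return df.tail(number_of_rows)


events = pd.DataFrame(
    {
        "step": [1, 2, 3, 4, 5, 6, 7, 8],
        "state": ["ok", "ok", "ok", "warn", "ok", "ok", "fail", "ok"],
    }
)
latest = tail_rows(events, 3)
print(latest)
```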
DataFrame Operations = Drop Duplicates
What and how it can be used:
The Drop Duplicates operation removes duplicate rows from the DataFrame by identifying repeated values within a single column, specified by Column Name. For each distinct value in that column, only the first occurrence is kept and later rows with the same value are removed. This component enables data cleaning by removing redundant entries and ensuring data uniqueness.
When/how the component should be used:
- Use to remove repeated records.
- Use before storage or reporting.
Connections with other components:
- ChatOutput
- Batch Run
- Parser
- Save File
- Smart Function
- Split Text
- Type Convert
- Notify
- ChromaDB
Configurable settings:
- DataFrame
- Operation
- Column Name
Default settings:
- DataFrame
- Operation
- Column Name
Control Section:
- DataFrame
- Operation
- Column Name
Default values:
- Operation : Drop Duplicates
Desired Behaviour:
- Keep only the first occurrence
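Example (illustrative sketch):
Deduplicating on a single column while keeping the first occurrence, sketched in pandas. The helper name drop_duplicate_rows and the sample data are hypothetical.

```python
import pandas as pd


def drop_duplicate_rows(df: pd.DataFrame, column_name: str) -> pd.DataFrame:
    """Hypothetical helper: keep the first row for each distinct value in one column."""
    # subset limits the comparison to one column; keep="first" retains the
    # earliest row for each value and drops the later repeats.
    return df.drop_duplicates(subset=[column_name], keep="first")


signups = pd.DataFrame(
    {
        "email": ["ana@example.com", "ben@example.com", "ana@example.com"],
        "name": ["Ana", "Ben", "Ana B."],
    }
)
unique_signups = drop_duplicate_rows(signups, "email")
print(unique_signups)
```

Rows are compared on the chosen column only, so the third row is dropped even though its name value differs.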
