DataFusion Node Catalog
Generated from 34 catalog nodes in Data/DataFusion.
Nodes in this category
Create DataFusion Session
Data/DataFusion: Creates a new DataFusion session for SQL analytics. Configure optimization settings for production workloads.
Mount CSV
Data/DataFusion: Mount CSV files from a FlowPath into a DataFusion session as a queryable table.
Mount JSON
Data/DataFusion: Mount JSON (newline-delimited) files from a FlowPath into a DataFusion session as a queryable table.
Mount Parquet
Data/DataFusion: Mount Parquet files from a FlowPath prefix into a DataFusion session as a queryable table.
Register Lance Table
Data/DataFusion: Register a LanceDB table into a DataFusion session for SQL queries. Uses the existing to_datafusion() implementation from the vector store.
Register Table
Data/DataFusion: Register a CSVTable (from Excel/CSV extraction) into a DataFusion session for SQL queries. Converts the table to an in-memory Arrow table.
SQL Query
Data/DataFusion: Execute a SQL query against a DataFusion session. Returns results as both a CSVTable (for analytics) and an array of row objects (for iteration).
Date Truncate Aggregation
Data/DataFusion/Aggregation: Truncate timestamps to a specific precision (hour, day, month, etc.) and aggregate. Simpler alternative to date_bin for standard intervals.
Time Bin Aggregation
Data/DataFusion/Aggregation: Create time-based aggregations using DataFusion's date_bin function. Groups data by fixed time intervals (minute, hour, day, etc.) and applies aggregation functions.
Window Aggregation
Data/DataFusion/Aggregation: Apply window functions for rolling/moving aggregations over time series data.
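
The difference between the three aggregation nodes above is easiest to see on a handful of rows. The following is a minimal pure-Python sketch of the semantics only, not of how the nodes are implemented. The sample data and the helper names (`trunc_to_hour`, `bin_timestamp`, `aggregate_mean`, `rolling_mean`) are hypothetical, and the epoch origin used for binning is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical sample readings: (timestamp, value).
ROWS = [
    (datetime(2024, 1, 1, 10, 5), 1.0),
    (datetime(2024, 1, 1, 10, 20), 3.0),
    (datetime(2024, 1, 1, 10, 50), 5.0),
    (datetime(2024, 1, 1, 11, 10), 7.0),
]


def trunc_to_hour(ts):
    """Drop everything below the hour, in the spirit of date_trunc('hour', ts)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def bin_timestamp(ts, width, origin=datetime(1970, 1, 1)):
    """Snap ts to the start of its fixed-width bin, in the spirit of date_bin.

    The origin defaults to the Unix epoch here; that default is an assumption.
    """
    return origin + ((ts - origin) // width) * width


def aggregate_mean(rows, key_fn):
    """Group rows by key_fn(timestamp) and average the values in each group."""
    groups = defaultdict(list)
    for ts, value in rows:
        groups[key_fn(ts)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


def rolling_mean(rows, size):
    """Mean over the current row and up to size - 1 preceding rows, ordered by time."""
    values = [value for _, value in sorted(rows)]
    result = []
    for i in range(len(values)):
        window = values[max(0, i - size + 1): i + 1]
        result.append(sum(window) / len(window))
    return result


# Truncation only supports calendar units such as hour or day.
hourly = aggregate_mean(ROWS, trunc_to_hour)
# Binning accepts an arbitrary fixed width, such as 30 minutes.
half_hourly = aggregate_mean(ROWS, lambda ts: bin_timestamp(ts, timedelta(minutes=30)))
# A window aggregate keeps one output per input row instead of collapsing groups.
rolling = rolling_mean(ROWS, 2)
```

Truncation and binning both collapse rows into one result per group, while the window version returns one value per input row. That is the practical reason to reach for Window Aggregation when a moving average is needed.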
Mount Athena S3 Results
Data/DataFusion/Databases: Mount Parquet files from an Athena query result location in S3. Supports explicit credentials or environment variables (including Lambda IAM roles).
Register Athena Table
Data/DataFusion/Databases: Register an AWS Athena table in DataFusion. Query S3 data via Athena's catalog. Supports explicit credentials or environment variables (including Lambda IAM roles).
Register BigQuery
Data/DataFusion/Databases: Register a Google BigQuery table or query result into a DataFusion session. Takes a GcpProvider for authentication; pair it with the GCP Provider node.
Register ClickHouse
Data/DataFusion/Databases: Register a ClickHouse table for federated queries. Uses a real database connection for full SQL push-down.
Register DuckDB
Data/DataFusion/Databases: Register a DuckDB database table for federated queries. Uses a real database connection.
Register FlightSQL
Data/DataFusion/Databases: Register a table via the Arrow Flight SQL protocol. High-performance columnar data transfer (10-100x faster than JDBC/ODBC). Supports Dremio, InfluxDB, DuckDB Flight, ClickHouse Flight, and more.
Register MySQL
Data/DataFusion/Databases: Register a MySQL table for federated queries. Uses a real database connection for full SQL push-down.
Register Oracle
Data/DataFusion/Databases: Register an Oracle database table for federated queries via ODBC. Requires Oracle Instant Client with the ODBC driver installed.
Register PostgreSQL
Data/DataFusion/Databases: Register a PostgreSQL table for federated queries. Uses a real database connection for full SQL push-down.
Register SQLite
Data/DataFusion/Databases: Register a SQLite database table for federated queries. Uses a real database connection.
Delta Table Info
Data/DataFusion/Lakes: Get metadata and history information about a Delta table.
Delta Time Travel
Data/DataFusion/Lakes: Load a specific version or timestamp of a Delta table for point-in-time queries.
Iceberg Table Info
Data/DataFusion/Lakes: Get metadata, snapshots, and history of an Apache Iceberg table from a metadata file.
Iceberg Time Travel
Data/DataFusion/Lakes: Load a specific snapshot of an Iceberg table for point-in-time queries.
Register Delta Table
Data/DataFusion/Lakes: Register a Delta Lake table in DataFusion using a FlowPath. Requires the 'delta' feature.
Register Hive Parquet
Data/DataFusion/Lakes: Register Hive-partitioned Parquet files as a table in DataFusion using a FlowPath.
Register Iceberg Table
Data/DataFusion/Lakes: Register an Apache Iceberg table in DataFusion from a metadata file. Supports schema evolution and partition pruning.
Register Partitioned JSON
Data/DataFusion/Lakes: Register partitioned JSON/NDJSON files as a table in DataFusion using a FlowPath.
Write Delta Table
Data/DataFusion/Lakes: Write query results to a new or existing Delta Lake table using a FlowPath. Supports append and overwrite modes.
DateTime to SQL Timestamp
Data/DataFusion/Time: Convert a DateTime (ISO 8601 string) to a SQL timestamp literal for use in DataFusion queries.
Time Range Filter
Data/DataFusion/Time: Generate a SQL WHERE clause for filtering by time range. Supports relative time expressions.
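
As an illustration of what the two time nodes above produce, here is a minimal Python sketch that turns an ISO 8601 string into a typed SQL `TIMESTAMP '...'` literal and builds a range predicate from two such literals. The function names `to_sql_timestamp`, `time_range_filter`, and `last_hours_filter` are hypothetical. So are the choices to normalize to UTC and to use a half-open range. The catalog does not specify the syntax of the nodes' relative time expressions, so the relative case is shown only as "the last N hours before a given instant".

```python
from datetime import datetime, timedelta, timezone


def to_sql_timestamp(iso):
    """Convert an ISO 8601 string to a literal like TIMESTAMP '2024-01-01 10:00:00'.

    Assumption: timezone-aware inputs are normalized to UTC and naive inputs
    are passed through unchanged.
    """
    # Older Python versions reject a trailing 'Z', so rewrite it as an offset.
    dt = datetime.fromisoformat(iso.replace("Z", "+00:00"))
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return "TIMESTAMP '" + dt.strftime("%Y-%m-%d %H:%M:%S") + "'"


def time_range_filter(column, start_iso, end_iso):
    """Build a half-open predicate: start <= column < end.

    The column name is inserted verbatim, so it must come from a trusted source.
    """
    start = to_sql_timestamp(start_iso)
    end = to_sql_timestamp(end_iso)
    return f"WHERE {column} >= {start} AND {column} < {end}"


def last_hours_filter(column, now_iso, hours):
    """Hypothetical relative form: the `hours` hours that end at `now_iso`."""
    now = datetime.fromisoformat(now_iso.replace("Z", "+00:00"))
    start = now - timedelta(hours=hours)
    return time_range_filter(column, start.isoformat(), now.isoformat())


clause = time_range_filter("event_time", "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z")
recent = last_hours_filter("event_time", "2024-01-01T12:00:00Z", 6)
```

A half-open range avoids counting a boundary row twice when adjacent ranges are queried back to back.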
Describe Table
Data/DataFusion/Tools: Get the schema (column names and types) of a table in a DataFusion session.
Execute SQL
Data/DataFusion/Tools: Execute a SQL query and return results as formatted text. Ideal for agent-driven data exploration.
List Tables
Data/DataFusion/Tools: List all tables registered in a DataFusion session. Returns an array of table names.