# Data Sources
## load — load a file

Load a CSV, Excel, or Parquet file into a named table.

```
load "data/sales.csv" as sales
load "catalog.xlsx" as products
load "archive.parquet" as transactions
```
Use a Python variable for the path:
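A sketch, assuming the `:` prefix for Python variables (the same convention `from` uses for connection strings) and a session variable named `csv_path`:

```
# csv_path is a Python variable holding the file path
load :csv_path as sales
```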
## bulk load — load generated file lists
Load many CSV or Parquet files from a Python list. With one alias, Pivotal concatenates the files into one table and adds a `source` column containing the input filename.
Concatenation unions columns by name, so files with extra or missing columns are combined, with missing values filled in where needed.
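A sketch of the single-alias form, assuming a Python list referenced with the `:` prefix (the list name `files` is hypothetical):

```
# files = ["data/jan.csv", "data/feb.csv", "data/mar.csv"] in Python
bulk load :files as sales
```

The resulting `sales` table holds every row from every file, plus a `source` column naming each row's origin file.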
Customize the provenance column:
The `source` value can be `filename`, `path`, or `stem`.
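A sketch, assuming the option is indented beneath the statement the way `load` options are (the exact option spelling is an assumption):

```
bulk load :files as sales
    source stem    # provenance column holds the filename stem
```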
Use multiple aliases to load each file separately:
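A sketch, assuming aliases are comma-separated in file order:

```
# files = ["jan.csv", "feb.csv", "mar.csv"] in Python
bulk load :files as jan, feb, mar
```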
Or provide aliases from a Python list:
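A sketch, assuming the alias list is also referenced with the `:` prefix:

```
# aliases = ["jan", "feb", "mar"] in Python
bulk load :files as :aliases
```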
The file list and alias list must have the same length. Aliases must be valid Pivotal table identifiers.
### Options

Indent options beneath the load statement:

```
load "data/sales.csv" as sales
    header 0                                # row index of header (default 0)
    names ["product", "quantity", "price"]  # override column names
```
| Option | Description |
|---|---|
| `header <n>` | Row index to use as column headers (default 0) |
| `names [...]` | List of column names to use instead of the file header |
## with — set or create a table
### Set active table
Make an existing table the active table for subsequent operations:
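For example, assuming the `sales` table loaded earlier:

```
with sales
```

Subsequent operations now apply to `sales`.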
### Create a derived table
Create a new table by applying operations to an existing one; the original is unchanged.
Without `as`, operations apply to the named table in-place:
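A minimal sketch of the in-place form, reusing `sales` from above:

```
with sales
    dropna price
```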
With `as`, a new table is created:
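A minimal sketch of the derived-table form; `sales` is left untouched and `cleaned` holds the result:

```
with sales as cleaned
    dropna price
```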
### Chaining
Multiple with blocks can be chained — the output of one becomes the input of the next:
```
load "data.csv" as raw

with raw as cleaned
    dropna price, quantity
    fillna 0

with cleaned as summary
    group by category
    agg sum price as total
    sort total desc
```
## from — load from a database
Connect to a database and load one or more tables or query results into named DataFrames.
### Load tables by name
Each `load` item reads the named SQL table with `SELECT * FROM <table>`. Multiple tables can be listed on a single line, comma-separated.
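For example (the table names are assumptions):

```
from "data/mydb.sqlite"
    load orders, customers
```

This creates two tables, `orders` and `customers`, named after the source tables.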
### Run arbitrary SQL
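For example, naming the result table with `as`:

```
from "data/mydb.sqlite"
    query "SELECT * FROM orders WHERE status = 'open'" as open_orders
```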
### Mix `load` and `query` in one block
```
from "data/mydb.sqlite"
    load orders as orders_raw
    query "SELECT category, SUM(amount) as total FROM orders GROUP BY category" as summary
```
The connection is opened once and all reads share it.
### Supported sources
| Source | Example path | Notes |
|---|---|---|
| SQLite | `"data/mydb.sqlite"` | Also `.db`, `.sqlite3` |
| DuckDB | `"data/warehouse.duckdb"` | Also `.ddb` |
| Python variable | `:conn_str` | Variable must hold a path or URI |
| PostgreSQL URI | `"postgresql://user:pass@host/db"` | Requires SQLAlchemy — see below |
| MySQL URI | `"mysql+pymysql://user:pass@host/db"` | Requires SQLAlchemy + driver |
| Other SQLAlchemy | any `dialect+driver://...` URI | Requires SQLAlchemy + driver |
### SQLAlchemy support (PostgreSQL, MySQL, and others)
For databases other than SQLite and DuckDB, Pivotal uses SQLAlchemy as the connection layer. SQLAlchemy is a Python library that provides a unified interface to many databases via connection URI strings.
Install SQLAlchemy and a driver:
```
# PostgreSQL
pip install sqlalchemy psycopg2-binary

# MySQL
pip install sqlalchemy pymysql

# Microsoft SQL Server
pip install sqlalchemy pyodbc
```
Usage:
```
from "postgresql://user:password@localhost:5432/mydb"
    load products as products
    query "SELECT * FROM orders WHERE status = 'open'" as open_orders
```
If SQLAlchemy is not installed, Pivotal will raise a clear error with the install command rather than a cryptic import failure.
**Polars backend:** Uses `connectorx` for URI sources if available (faster), falling back to SQLAlchemy + `pl.from_pandas()`. Install with `pip install connectorx`.

**DuckDB backend:** Uses DuckDB's native extensions (`postgres_scanner`, etc.) for URI sources rather than SQLAlchemy.
## delete — remove a table
Remove a table from memory:
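For example, dropping the `sales` table loaded earlier:

```
delete sales
```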
In Jupyter, this is equivalent to `del sales` in Python.