Syntax Reference
This page is a compact reference for Pivotal statements and options. It is intended for checking exact command shapes; see the user guide topic pages for longer examples and explanation.
Notation
| Placeholder | Meaning |
|---|---|
<table> |
Pivotal table/DataFrame name |
<col> |
Column name |
<name> |
Identifier used for a table, column, chart, function, or native value |
<value> |
Boolean, number, string, identifier, path, None, or Python variable |
<expr> |
Raw expression text parsed by the selected backend |
<condition> |
One or more comparisons joined by and or or |
:var / :func(col) |
Python runtime variable or callable |
[ ... ] |
Optional syntax in this reference, unless shown inside code as a list |
Pivotal identifiers start with a letter and may contain letters, numbers, and
underscores. Strings may use single or double quotes. Comments may use #,
--, or /* ... */.
Values, Lists, and Conditions
Accepted value forms:
True
False
None
123
-123
12.5
"text"
'text'
identifier
:python_variable
:python_list[0]
:python_dict["key"]
path/to/file.csv
Accepted list forms:
Conditions use a column on the left:
<col> == <value>
<col> != <value>
<col> > <value>
<col> < <value>
<col> >= <value>
<col> <= <value>
<col> between [<low>, <high>]
<col> contains <value>
<col> not contains <value>
<col> matches <value>
<col> not matches <value>
<col> startswith <value>
<col> endswith <value>
<col> in <list-or-name-or-python-var>
<col> not in <list-or-name-or-python-var>
Multiple conditions may be joined with and or or.
Native Values
list
Define a reusable list:
List values may be literals, identifiers, paths, or Python variables. Named lists can be used in column lists, filter value lists, loop sources, and function arguments.
Index into a known list with square brackets:
scalar
Define a reusable single value:
Examples:
dict
Define a nested dictionary inline:
Load a dictionary from JSON or YAML:
Bind an existing Python dictionary:
Access known dictionary values with dot paths:
Native values persist in the Python namespace. Later cells may also use
:name, :name[0], or :name["key"] runtime references, but native lookup is
preferred inside Pivotal code when the value belongs to the pipeline.
Data Sources
load
Load a file into a table:
load <path-or-string-or-python-var> as <table>
load <path-or-string-or-python-var> as <table>
<reader_option> <value>
<reader_option> = <value>
<reader_option> [<value>, ...]
Examples:
load reader options are pass-through keyword arguments for the generated file
reader. CSV files use read_csv, Excel files use read_excel, and Parquet
files use read_parquet.
Load tables from the active package:
bulk load
Load a list of CSV or Parquet files.
bulk load :files as <table>
bulk load "<file1>", "<file2>" as <table>
bulk load :files as <table1>, <table2>, ...
bulk load :files as :aliases
bulk load :files as <table>
source column <col>
source value path|filename|stem
<reader_option> <value>
With one static alias, files are concatenated into one table and a source column is added. With multiple aliases, or with a Python alias list, each file is loaded as a separate table.
Defaults:
| Option | Default |
|---|---|
source column |
source |
source value |
filename |
from
Load tables or query results from a database.
The block may contain any mix of load and query lines. Supported source
forms include SQLite paths, DuckDB paths, SQLAlchemy URIs, and Python variables
containing those values.
Table Blocks
with
Set the active table:
Create a derived table and make it active:
Statements that transform a table are normally indented under with.
Assignment
Create or replace a column on the active table:
Conditional assignment:
Grouped/windowed assignment:
Multi-case assignment:
The else branch is optional. Without it, unmatched rows receive the backend's
missing value.
Python runtime functions in assignment expressions must use ::
Bare function calls are reserved for built-in Pivotal functions and backend-native functions.
for
Repeat column-oriented operations over a list of columns.
Inside a for block, assignment targets can be built from identifiers and
string literals:
Allowed statements inside for are assignment, cast, fillna, drop,
dropna, round, rank, lag, lead, cumsum, cummean, cummin,
cummax, and rolling.
Rows and Columns
filter
Keep rows matching a condition:
assert and check
Validate data quality. assert fails on invalid data; check warns and
continues.
assert <condition>
check <condition>
assert <col>, ... unique
check <col>, ... unique
assert <col>, ... not null
check <col>, ... not null
select
Keep selected columns:
select <col>, ...
select <col> as <new_col>, ...
select :python_column_or_columns
select matches "<regex>"
Aliases are supported for explicit column names. A Python variable may provide a single column or a list of columns.
drop
Remove columns:
rename
Rename columns:
sort / order by
Sort rows:
Ascending order is the default.
distinct
Drop duplicate rows:
Without columns, duplicates are checked across the full row.
fillna
Fill missing values:
dropna
Drop rows with missing values:
cast
Cast one or more columns:
Without strict, bad values are coerced where the backend supports coercion.
round
Round numeric columns:
The as form is valid only with a single source column.
Aggregation and Reshaping
group by
Group and aggregate the active table:
If agg is omitted, backends use their default grouped aggregation behavior.
agg
Aggregate the whole active table without grouping:
Aggregation item forms:
<func> <col-or-python-var> [as <new_col>]
<func>(<col-or-python-var>) [as <new_col>]
quantile <col-or-python-var> <q> [as <new_col>]
quantile(<col-or-python-var>, <q>) [as <new_col>]
percentile <col-or-python-var> <p> [as <new_col>]
percentile(<col-or-python-var>, <p>) [as <new_col>]
wmean(<col-or-python-var>, <weight-col-or-python-var>) [as <new_col>]
wmean <weight-col-or-python-var> <col-or-python-var> [as <new_col>]
wmean <weight-col-or-python-var> <col1> <col2> ...
:python_func <col> ... [as <new_col>]
:python_func(<col>, ..., <kw>=<value>) [as <new_col>]
Built-in aggregation functions accepted by the grammar are mean, avg, sum,
min, max, count, median, std, nunique, quantile, and percentile.
Older wavg syntax is accepted as a backward-compatible alias for wmean, but
new code should use wmean.
Pivotal also expands shorthand such as:
to:
pivot
Create a pivot table:
rows, cols, and agg lines may appear in any order. Use the same aggregation
item forms as group by.
unpivot
Convert wide data to long data:
unpivot
id <col-or-python-var>, ...
cols <col-or-python-var>, ...
variable "<variable_column_name>"
value "<value_column_name>"
Defaults:
| Option | Default |
|---|---|
variable |
variable |
value |
value |
Combining Tables
merge
Join another table to the active table:
merge <right_table>
<left|right|inner|outer> merge <right_table>
merge <right_table> on <key>, ...
<left|right|inner|outer> merge <right_table> on <key>, ...
<merge_option> <value>
<merge_option> = <value>
The default merge type is inner. Indented merge options are pass-through
keyword arguments for the generated backend merge/join call.
concat
Append tables vertically to the active table:
intersect
Keep rows in the active table that also appear in the listed tables:
exclude
Remove rows from the active table that appear in the listed tables:
Window Functions
rank
Rank rows by a column:
Ascending order is the default. pct returns percentile ranks.
lag and lead
Read values from prior or later rows:
lag <col> <periods> as <new_col>
by <col>, ...
order <col>
lead <col> <periods> as <new_col>
by <col>, ...
order <col>
Use order to make the row order explicit.
Cumulative functions
Compute cumulative values:
cumsum <col> as <new_col>
by <col>, ...
order <col>
cummean <col> as <new_col>
by <col>, ...
order <col>
cummin <col> as <new_col>
by <col>, ...
order <col>
cummax <col> as <new_col>
by <col>, ...
order <col>
rolling
Compute rolling-window values:
Rolling aggregation functions use the same grammar token as agg: mean,
avg, sum, min, max, count, median, std, and nunique. Backend
support for individual rolling functions may vary.
Output
show
Display the active table:
plot
Create or update a matplotlib chart:
plot <name>
plot <kind> <name>
plot <kind> on <existing_chart_name>
plot <name>
<plot_option> <value>
<plot_option> = <value>
<plot_option> [<value>, ...]
by <col>
cols <n>
show
Common plot options include kind, x, y, title, legend, c,
colormap, style, and canvas. Most options are passed through to pandas or
matplotlib plotting code.
Labels can be supplied after a value:
pivot plot
Group, aggregate, and plot in one command:
pivot plot <name>
pivot plot <kind> <name>
x <col> ["label"]
y <func> <col>, <func> <col>, ... ["label"]
by <col>
cols <n>
canvas <name>
show
table
Create a Great Tables table:
table <name>
title "<text>"
subtitle "<text>"
font size <number>
font "<family>"
stub <col>
stub <col> "<label>"
stub <col>, <group_col>
stub <col>, <group_col> "<label>"
stripe
canvas <name>
label <col> as "<label>", ...
format <col> as <format_type>
format <format_type>
style "<path>"
summary <summary_spec>, ...
spanner <col>, ... "<label>"
auto spanner
show
Format types:
Summary specs:
Python Integration
python
Run a single Python statement:
Run a Python block:
python...end blocks are preprocessed into a single Python statement before
parsing, so comments and indentation inside the Python code are preserved.
apply
Apply a Python function to the active table:
The function receives the active DataFrame and must return a DataFrame.
Packages and Session Objects
save
Save outputs to a Pivotal data package:
save "<package_name>"
save "<package_name>"
path "<directory>"
path :python_path
format <csv|parquet>
chart_format <png|svg>
include <name>, ...
exclude <name>, ...
delete
Remove a table from the session namespace:
Pipeline Functions
function
Define a reusable non-recursive pipeline:
return is optional. It records output metadata for Python-callable wrappers.
Function calls
Call a Pivotal pipeline function:
Function arguments may be values, Python variables, named lists, or inline round-bracket lists: