Selection
select — keep columns
Keep only the specified columns, in the given order:
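The behaviour can be sketched in plain Python, assuming rows are represented as dictionaries. The helper name `select` and the sample data are illustrative assumptions, not the tool's own syntax:

```python
# Hypothetical helper that mirrors the described semantics; not the tool's syntax.
def select(rows, columns):
    """Keep only `columns`, in the order they are listed."""
    return [{col: row[col] for col in columns} for row in rows]

rows = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Linus", "city": "Helsinki"},
]

# The output follows the order of the column list, not the source order.
selected = select(rows, ["name", "id"])
# selected == [{"name": "Ada", "id": 1}, {"name": "Linus", "id": 2}]
```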
drop — remove columns
Remove specific columns, keep everything else:
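As a rough illustration of the semantics, again assuming rows are plain Python dictionaries (the helper name `drop` and the sample data are assumptions made for this sketch):

```python
# Hypothetical helper that mirrors the described semantics; not the tool's syntax.
def drop(rows, columns):
    """Remove the listed columns and keep every other column untouched."""
    unwanted = set(columns)
    return [
        {name: value for name, value in row.items() if name not in unwanted}
        for row in rows
    ]

rows = [
    {"id": 1, "name": "Ada", "password": "x1"},
    {"id": 2, "name": "Linus", "password": "y2"},
]

remaining = drop(rows, ["password"])
# remaining == [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Linus"}]
```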
distinct — remove duplicate rows
Deduplicate on all columns
Remove rows where every column is identical to another row:
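A minimal sketch of full-row deduplication, assuming rows are dictionaries with hashable values. The helper name `distinct_rows` is hypothetical, and keeping the first of each set of duplicates is an assumption of this sketch:

```python
# Hypothetical helper that mirrors the described semantics; not the tool's syntax.
def distinct_rows(rows):
    """Drop any row whose every column matches an earlier row."""
    seen = set()
    unique = []
    for row in rows:
        key = frozenset(row.items())  # identity of the row across all columns
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

rows = [
    {"id": 1, "city": "London"},
    {"id": 1, "city": "London"},   # exact duplicate of the first row
    {"id": 1, "city": "Paris"},    # differs in one column, so it is kept
]

unique = distinct_rows(rows)
# unique == [{"id": 1, "city": "London"}, {"id": 1, "city": "Paris"}]
```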
Deduplicate on specific columns
Keep the first occurrence for each unique combination of the specified columns:
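The following sketch illustrates this, assuming rows are dictionaries; the helper name `distinct_on` is hypothetical. It also reflects that the result contains only the columns used for deduplication:

```python
# Hypothetical helper that mirrors the described semantics; not the tool's syntax.
def distinct_on(rows, columns):
    """Keep the first occurrence of each combination of `columns`.

    The result contains only the listed columns.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[col] for col in columns)
        if key not in seen:
            seen.add(key)
            unique.append(dict(zip(columns, key)))
    return unique

rows = [
    {"country": "UK", "city": "London", "visits": 3},
    {"country": "UK", "city": "London", "visits": 7},
    {"country": "FR", "city": "Paris", "visits": 2},
]

pairs = distinct_on(rows, ["country", "city"])
# pairs == [{"country": "UK", "city": "London"}, {"country": "FR", "city": "Paris"}]
```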
Note
When specific columns are provided, only those columns are kept in the result. To deduplicate on certain columns while keeping all columns, combine distinct with a merge, or use apply with a Python function.
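As a sketch of the kind of Python function that could serve here, the following keeps the first full row for each key combination. The function name `first_per_key` and the row-as-dictionary representation are assumptions; how such a function is passed to apply is not shown, because that interface is not described in this section:

```python
# Hypothetical function; the row representation is an assumption of this sketch.
def first_per_key(rows, key_columns):
    """Deduplicate on `key_columns` but return the complete first row per key."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)  # the whole row survives, not just the key columns
    return kept

rows = [
    {"country": "UK", "city": "London", "visits": 3},
    {"country": "UK", "city": "London", "visits": 7},
    {"country": "FR", "city": "Paris", "visits": 2},
]

first_rows = first_per_key(rows, ["country", "city"])
# first_rows == [
#     {"country": "UK", "city": "London", "visits": 3},
#     {"country": "FR", "city": "Paris", "visits": 2},
# ]
```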
rename — rename columns
Multiple renames can be given in one statement, separated by commas.
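The effect of several renames applied together can be sketched in plain Python, assuming rows are dictionaries. The helper name `rename` and the old-to-new mapping argument are illustrative assumptions, not the tool's own syntax:

```python
# Hypothetical helper that mirrors the described semantics; not the tool's syntax.
def rename(rows, mapping):
    """Rename columns according to `mapping` (old name -> new name).

    Columns not mentioned in the mapping keep their names and positions.
    """
    return [
        {mapping.get(name, name): value for name, value in row.items()}
        for row in rows
    ]

rows = [{"fname": "Ada", "lname": "Lovelace", "age": 36}]

# Two renames applied in a single call.
renamed = rename(rows, {"fname": "first_name", "lname": "last_name"})
# renamed == [{"first_name": "Ada", "last_name": "Lovelace", "age": 36}]
```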