VGI provides CLI tools for invoking functions and managing catalogs without writing code.
| Command | Description |
|---|---|
| vgi-client | Invoke functions and manage catalogs |
| vgi-fixture-worker | Run the example worker with demo functions |
The main CLI for invoking VGI functions and managing catalogs.
vgi-client [OPTIONS]
Options:
| Option | Description |
|---|---|
| --input FILE | Input parquet file (omit for table functions) |
| --output FILE | Output file (use - for stdout) |
| --format FORMAT | Output format: json (default), csv, parquet |
| --function NAME | Function name to invoke |
| --args JSON | Function arguments as JSON array (default: []) |
| --worker PATH | Worker command (default: vgi-fixture-worker) |
| --type TYPE | Function type: auto, table, table-in-out, scalar |
| --projection-id N | Column IDs to project (repeatable) |
| --max-workers N | Limit parallel workers |
| --worker-stderr | Show worker stderr output |
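When vgi-client is driven from another program, the options above map onto an argument list. The following is a minimal sketch that uses only flags from the table; the helper name build_invoke_argv is hypothetical and not part of VGI.

```python
import json
import shlex


def build_invoke_argv(function, args=(), input_file=None, fmt="json", output=None):
    """Build an argv list for vgi-client from the documented flags.

    Hypothetical helper for illustration only; it is not part of VGI.
    """
    argv = ["vgi-client", "--function", function]
    if input_file is not None:
        # --input is omitted for table functions, which generate their own data.
        argv += ["--input", input_file]
    # --args takes a JSON array, so serialize it instead of hand-quoting.
    argv += ["--args", json.dumps(list(args))]
    argv += ["--format", fmt]
    if output is not None:
        argv += ["--output", output]
    return argv


argv = build_invoke_argv("multiply", ["price", 2], input_file="data.parquet")
# shlex.join produces a shell-safe string for logging or copy-pasting.
print(shlex.join(argv))
```

Passing the list straight to subprocess.run avoids shell quoting problems with the JSON in --args.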
Table function (generates data):
# Generate a sequence of 100 integers
vgi-client --function sequence --args '[100]'
# Output as CSV
vgi-client --function sequence --args '[10]' --format csv
Table-in-out function (transforms data):
# Echo input unchanged
vgi-client --input data.parquet --function echo
# Sum all numeric columns
vgi-client --input data.parquet --function sum_all_columns
# Repeat each row 3 times
vgi-client --input data.parquet --function repeat_inputs --args '[3]'
Scalar function (per-row transform):
# Multiply values in column "price" by 2
vgi-client --input data.parquet --function multiply --args '["price", 2]' --type scalar
Output to file:
vgi-client --function sequence --args '[1000]' --output result.parquet --format parquet
Manage database catalogs exposed by VGI workers.
Most catalog operations require an attach ID. Two workflows are supported:
Explicit attach (recommended for stateful catalogs):
# Attach and capture the attach ID
ATTACH_ID=$(vgi-client catalog attach mydb --worker ./worker.py | jq -r '.attach_opaque_data')
# Use attach ID for subsequent operations
vgi-client catalog schema list --attach-opaque-data $ATTACH_ID --worker ./worker.py
# Detach when done
vgi-client catalog detach $ATTACH_ID --worker ./worker.py
Auto-attach (for stateless catalogs):
# Specify catalog name instead of attach ID
vgi-client catalog schema list --catalog mydb --worker ./worker.py
List available catalogs from a worker.
vgi-client catalog list --worker ./worker.py
Attach to a catalog and get an attach ID.
vgi-client catalog attach <name> --worker <worker> [--options '{}']
Output:
{
  "attach_opaque_data": "a1b2c3d4",
  "supports_transactions": true,
  "catalog_version": 1
}
Detach from a catalog.
vgi-client catalog detach <attach_opaque_data> --worker <worker>
Create a new catalog.
vgi-client catalog create <name> --worker <worker> \
[--on-conflict {error|ignore|replace}] \
[--options '{}']
Drop a catalog.
vgi-client catalog drop <name> --worker <worker>
Get the current catalog version.
vgi-client catalog version --catalog <name> --worker <worker>
Manage schemas within a catalog.
List all schemas in a catalog.
vgi-client catalog schema list --catalog <name> --worker <worker>
Get schema details.
vgi-client catalog schema get <schema_name> --catalog <name> --worker <worker>
Create a new schema.
vgi-client catalog schema create <schema_name> \
--catalog <name> --worker <worker> \
[--comment "Description"] \
[--tags '{"key": "value"}']
Drop a schema.
vgi-client catalog schema drop <schema_name> \
--catalog <name> --worker <worker> \
[--ignore-not-found] [--cascade]
List all objects in a schema.
vgi-client catalog schema contents <schema_name> --catalog <name> --worker <worker>
Manage tables within a schema.
Get table details.
vgi-client catalog table get <schema> <table> --catalog <name> --worker <worker>
Create a new table.
vgi-client catalog table create <schema> <table> \
--catalog <name> --worker <worker> \
--columns '[{"name": "id", "type": "int64"}, {"name": "name", "type": "string"}]' \
[--not-null 0] \
[--unique "0,1"] \
[--check "id > 0"] \
[--on-conflict {error|ignore|replace}]
Supported column types:
| Category | Types |
|---|---|
| Integer | int8, int16, int32, int64, uint8, uint16, uint32, uint64 |
| Float | float16, float32, float64 |
| String | string, utf8, large_string, binary, large_binary |
| Boolean | bool, boolean |
| Date | date32, date64 |
| Timestamp | timestamp, timestamp_s, timestamp_ms, timestamp_us, timestamp_ns |
| Duration | duration, duration_s, duration_ms, duration_us, duration_ns |
| Time | time32, time64 |
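The --columns value is a JSON array of name/type objects, which is easy to mistype by hand. Below is a minimal sketch that builds it and checks type names against the table above; the helper columns_json is hypothetical, and whether the CLI performs the same validation is not stated here.

```python
import json

# Type names copied from the supported column types table.
SUPPORTED_TYPES = {
    "int8", "int16", "int32", "int64", "uint8", "uint16", "uint32", "uint64",
    "float16", "float32", "float64",
    "string", "utf8", "large_string", "binary", "large_binary",
    "bool", "boolean",
    "date32", "date64",
    "timestamp", "timestamp_s", "timestamp_ms", "timestamp_us", "timestamp_ns",
    "duration", "duration_s", "duration_ms", "duration_us", "duration_ns",
    "time32", "time64",
}


def columns_json(columns):
    """Serialize (name, type) pairs into the JSON array expected by --columns.

    Hypothetical helper for illustration; raises ValueError on unknown types.
    """
    specs = []
    for name, type_name in columns:
        if type_name not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported column type: {type_name!r}")
        specs.append({"name": name, "type": type_name})
    return json.dumps(specs)


print(columns_json([("id", "int64"), ("name", "string")]))
```

The resulting string can be passed as the value of --columns in the table create command.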
Drop a table.
vgi-client catalog table drop <schema> <table> \
--catalog <name> --worker <worker> \
[--ignore-not-found]
Rename a table.
vgi-client catalog table rename <schema> <old_name> <new_name> \
--catalog <name> --worker <worker>
Set or clear table comment.
# Set comment
vgi-client catalog table comment <schema> <table> \
--catalog <name> --worker <worker> \
--set "Table description"
# Clear comment
vgi-client catalog table comment <schema> <table> \
--catalog <name> --worker <worker> \
--clear
Get the scan function for a table.
vgi-client catalog table scan-function <schema> <table> \
--catalog <name> --worker <worker>
Modify table columns.
Add a column to a table.
vgi-client catalog table column add <schema> <table> \
--catalog <name> --worker <worker> \
--column '{"name": "email", "type": "string"}' \
[--if-not-exists]
Drop a column from a table.
vgi-client catalog table column drop <schema> <table> <column> \
--catalog <name> --worker <worker> \
[--if-exists] [--cascade]
Rename a column.
vgi-client catalog table column rename <schema> <table> <old_name> <new_name> \
--catalog <name> --worker <worker>
Set column default value.
vgi-client catalog table column set-default <schema> <table> <column> "0" \
--catalog <name> --worker <worker>
Remove column default value.
vgi-client catalog table column drop-default <schema> <table> <column> \
--catalog <name> --worker <worker>
Change column type.
vgi-client catalog table column set-type <schema> <table> \
--catalog <name> --worker <worker> \
--column '{"name": "count", "type": "int64"}' \
[--using "CAST(count AS int64)"]
Set or remove NOT NULL constraint.
vgi-client catalog table column set-not-null <schema> <table> <column> \
--catalog <name> --worker <worker>
vgi-client catalog table column drop-not-null <schema> <table> <column> \
--catalog <name> --worker <worker>
Manage views within a schema.
Get view details.
vgi-client catalog view get <schema> <view> --catalog <name> --worker <worker>
Create a view.
vgi-client catalog view create <schema> <view> \
--catalog <name> --worker <worker> \
--definition "SELECT id, name FROM users WHERE active = true" \
[--on-conflict {error|ignore|replace}]
Drop a view.
vgi-client catalog view drop <schema> <view> \
--catalog <name> --worker <worker> \
[--ignore-not-found]
Rename a view.
vgi-client catalog view rename <schema> <old_name> <new_name> \
--catalog <name> --worker <worker>
Set or clear view comment.
vgi-client catalog view comment <schema> <view> \
--catalog <name> --worker <worker> \
--set "View description"
Manage transactions for catalogs that support them.
Begin a new transaction.
TX_ID=$(vgi-client catalog transaction begin \
--attach-opaque-data $ATTACH_ID --worker <worker> | jq -r '.transaction_opaque_data')
Commit a transaction.
vgi-client catalog transaction commit $TX_ID \
--attach-opaque-data $ATTACH_ID --worker <worker>
Roll back a transaction.
vgi-client catalog transaction rollback $TX_ID \
--attach-opaque-data $ATTACH_ID --worker <worker>
# Attach to catalog
ATTACH_ID=$(vgi-client catalog attach mydb --worker ./worker.py | jq -r '.attach_opaque_data')
# Begin transaction
TX_ID=$(vgi-client catalog transaction begin \
--attach-opaque-data $ATTACH_ID --worker ./worker.py | jq -r '.transaction_opaque_data')
# Make changes within transaction
vgi-client catalog table create main users \
--attach-opaque-data $ATTACH_ID --transaction-opaque-data $TX_ID --worker ./worker.py \
--columns '[{"name":"id","type":"int64"}]'
# Commit or rollback
vgi-client catalog transaction commit $TX_ID \
--attach-opaque-data $ATTACH_ID --worker ./worker.py
# Detach
vgi-client catalog detach $ATTACH_ID --worker ./worker.py
All workers that use Worker.main() (including vgi-fixture-worker) support
logging options on the command line. Logs are written to stderr.
| Option | Description |
|---|---|
| --debug | Enable DEBUG level on all vgi and vgi_rpc loggers |
| --log-level LEVEL | Set log level: DEBUG, INFO (default), WARNING, ERROR |
| --log-logger NAME | Target specific logger(s) instead of all defaults (repeatable) |
| --log-format FORMAT | Stderr format: text (default) or json |
| --quiet / -q | Suppress the interactive-terminal startup warning |
--debug overrides --log-level when both are provided.
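The precedence rule can be pictured with a small argparse sketch. This only illustrates the documented behavior; it is not how Worker.main() parses its arguments, and the function name effective_level is hypothetical.

```python
import argparse
import logging


def effective_level(argv):
    """Return the logging level implied by --debug and --log-level.

    Illustration of the documented precedence only, not VGI's own parser.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--debug", action="store_true")
    parser.add_argument(
        "--log-level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
    )
    # Ignore any other worker options that may be present.
    ns, _unknown = parser.parse_known_args(argv)
    # --debug wins whenever it is present, regardless of --log-level.
    name = "DEBUG" if ns.debug else ns.log_level
    return getattr(logging, name)


print(logging.getLevelName(effective_level(["--debug", "--log-level", "WARNING"])))
```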
# Enable debug logging
vgi-fixture-worker --debug
# Set WARNING level only
vgi-fixture-worker --log-level WARNING
# Target a specific logger at DEBUG
vgi-fixture-worker --log-level DEBUG --log-logger vgi.worker
# JSON-formatted logs (for structured log pipelines)
vgi-fixture-worker --log-format json
| Logger | Description |
|---|---|
| vgi | VGI root logger (all VGI messages) |
| vgi.worker | Worker lifecycle (startup, shutdown) |
| vgi.client | Client operations (spawn, bind, exchange) |
| vgi.client.cli | CLI front-end (argument parsing) |
| vgi.filter_pushdown | Filter pushdown debug (deserialization/evaluation) |
| vgi_rpc | vgi_rpc root logger (all vgi_rpc messages) |
| vgi_rpc.wire.request | RPC wire request (serialised request bytes) |
| vgi_rpc.wire.response | RPC wire response (serialised response bytes) |
| vgi_rpc.wire.transport | Transport layer (pipe/HTTP transport debug) |
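Assuming these names are ordinary Python logging loggers (the dotted names suggest so, but that is an assumption), the standard hierarchy rules apply when a worker is embedded in your own program. The sketch below shows how a level set on a parent reaches its children, and that vgi and vgi_rpc are separate trees.

```python
import logging

# Logger names taken from the table above.
root_vgi = logging.getLogger("vgi")
worker = logging.getLogger("vgi.worker")
wire_request = logging.getLogger("vgi_rpc.wire.request")

# Quiet default for everything under "vgi".
root_vgi.setLevel(logging.WARNING)
# Verbose output for one specific logger in the separate "vgi_rpc" tree.
wire_request.setLevel(logging.DEBUG)

# A logger without its own level inherits from the nearest configured ancestor.
print(logging.getLevelName(worker.getEffectiveLevel()))
print(logging.getLevelName(wire_request.getEffectiveLevel()))
```

Because "vgi_rpc" is not a dotted child of "vgi", setting a level on one does not affect the other, which is consistent with the table listing both as roots.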
| Variable | Description |
|---|---|
| VGI_QUIET=1 | Suppress the interactive-terminal startup warning (same as --quiet) |
| VGI_FILTER_DEBUG=1 | Enable filter pushdown debug logging |
| VGI_BEARER_TOKENS | Comma-separated token=principal pairs for static bearer auth (HTTP only) |
| VGI_JWT_ISSUER | JWT issuer URL for JWT/JWKS auth (requires vgi[oauth] extra) |
| VGI_JWT_AUDIENCE | JWT audience string, comma-separated for multiple audiences (required when VGI_JWT_ISSUER is set) |
| VGI_JWT_JWKS_URI | JWKS endpoint URL (auto-discovered if omitted) |
| VGI_OAUTH_RESOURCE | OAuth resource URL for RFC 9728 metadata |
| VGI_OAUTH_AUTH_SERVERS | Comma-separated authorization server URLs |
| VGI_OAUTH_CLIENT_ID | Client ID for MCP compatibility (optional, URL-safe chars only) |
| VGI_OTEL_ENABLED | Enable OpenTelemetry instrumentation (1/true/yes) |
| VGI_OTEL_CUSTOM_ATTRIBUTES | Comma-separated key=value pairs for custom span/metric attributes |
| VGI_OTEL_CLAIM_ATTRIBUTES | Comma-separated claim_key=span_attr_name pairs for claim extraction |
| VGI_OTEL_DISABLE_TRACING | Disable tracing only (1/true/yes) |
| VGI_OTEL_DISABLE_METRICS | Disable metrics only (1/true/yes) |
Note: Service name, exporters, and endpoints are configured via standard
OTEL_* SDK env vars (e.g. OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT).
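Several of the variables above share two value shapes: comma-separated key=value pairs and 1/true/yes flags. The following sketch shows one plausible way to read them; parse_pairs and is_enabled are hypothetical helpers, and VGI's own parsing may treat whitespace, case, or malformed items differently.

```python
def parse_pairs(raw):
    """Parse 'k1=v1,k2=v2' into a dict, skipping empty items.

    Hypothetical helper; raises ValueError for items without '='.
    """
    pairs = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def is_enabled(raw):
    """True for 1, true, or yes. Case-insensitive matching is an assumption."""
    return raw.strip().lower() in {"1", "true", "yes"}


# Sample values in the same shape as the examples in this section.
env = {
    "VGI_OTEL_ENABLED": "1",
    "VGI_OTEL_CUSTOM_ATTRIBUTES": "deployment=prod,region=us-east-1",
}
print(is_enabled(env["VGI_OTEL_ENABLED"]))
print(parse_pairs(env["VGI_OTEL_CUSTOM_ATTRIBUTES"]))
```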
OTEL usage examples:
# Enable OTEL with standard SDK configuration
VGI_OTEL_ENABLED=1 \
OTEL_SERVICE_NAME=my-vgi-worker \
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
vgi-serve my_worker.py --http
# With custom attributes and claim extraction
VGI_OTEL_ENABLED=1 \
VGI_OTEL_CUSTOM_ATTRIBUTES="deployment=prod,region=us-east-1" \
VGI_OTEL_CLAIM_ATTRIBUTES="tenant_id=rpc.vgi_rpc.auth.claim.tenant_id" \
vgi-serve my_worker.py --http
Programmatic usage:
from vgi_rpc.otel import OtelConfig
from vgi.serve import create_app, load_worker_class
app = create_app(
    load_worker_class("my_worker:MyWorker"),
    otel_config=OtelConfig(
        custom_attributes={"deployment": "prod"},
        claim_attributes={"tenant_id": "rpc.vgi_rpc.auth.claim.tenant_id"},
    ),
)
Runs the built-in example worker with demo functions.
vgi-fixture-worker
Available functions:
| Function | Type | Description |
|---|---|---|
| echo | table-in-out | Pass through input unchanged |
| sum_all_columns | table-in-out | Sum all numeric columns |
| repeat_inputs | table-in-out | Repeat each row N times |
| buffer_input | table-in-out | Collect all input, emit on finalize |
| sequence | table | Generate sequence of integers |
| double_sequence | table | Generate sequence of floats |
| nested_sequence | table | Generate sequence with nested struct/list columns |
| partitioned_sequence | table | Generate sequence across multiple workers |
| projected_data | table | Generate data with projection pushdown |
| ten_thousand | table | Generate 10000 integers |
| constant_columns | table | Generate rows with constant values from varargs |
| named_params_echo | table | Echo named parameter values in output columns |
| multiply | scalar | Multiply values by a constant factor |
| double | scalar | Double numeric values |
| add_values | scalar | Add two columns together |
| sum_values | scalar | Sum multiple numeric values (varargs) |
| upper_case | scalar | Convert string values to uppercase |
| null_handling | scalar | Returns value or -5000 if null |
| random_int | scalar | Generate random integers (VOLATILE) |
| bernoulli | scalar | Generate random booleans (VOLATILE) |
| random_bytes | scalar | Generate pseudo-random binary blobs |
The mutable catalog demo (vgi/examples/catalog.py) is no longer installed as a
console script. Run it from a source checkout via:
python -m vgi._test_fixtures.catalog
Line-delimited JSON, one record per line:
vgi-client --function sequence --args '[3]' --format json
{"n": 0}
{"n": 1}
{"n": 2}
CSV with headers:
vgi-client --function sequence --args '[3]' --format csv
n
0
1
2
Binary Apache Parquet format (requires output file):
vgi-client --function sequence --args '[1000]' --format parquet --output data.parquet
# Generate data and process it
vgi-client --function sequence --args '[100]' --format parquet --output /tmp/data.parquet
vgi-client --input /tmp/data.parquet --function sum_all_columns
# Extract specific fields
vgi-client catalog attach mydb --worker ./worker.py | jq -r '.attach_opaque_data'
# Pretty print
vgi-client --function sequence --args '[3]' | jq .
#!/bin/bash
WORKER="./my_worker.py"
# Attach
ATTACH_ID=$(vgi-client catalog attach mydb --worker "$WORKER" | jq -r '.attach_opaque_data')
# List schemas
vgi-client catalog schema list --attach-opaque-data "$ATTACH_ID" --worker "$WORKER"
# Cleanup
vgi-client catalog detach "$ATTACH_ID" --worker "$WORKER"
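The same attach, list, detach flow can be driven from Python when the cleanup step must run even if listing fails. This is a sketch under two assumptions: that catalog attach prints a single JSON object on stdout, as in the Output example earlier, and that the helper names run_cli and list_schemas (both hypothetical) suit your script.

```python
import json
import subprocess


def run_cli(argv):
    """Run a command and return its stdout as text; raises on non-zero exit."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def list_schemas(worker, catalog, run=run_cli):
    """Attach, list schemas, and always detach.

    Hypothetical helper; `run` is injectable so the flow can be tested
    without a real worker.
    """
    attach_out = run(["vgi-client", "catalog", "attach", catalog, "--worker", worker])
    attach_id = json.loads(attach_out)["attach_opaque_data"]
    try:
        return run([
            "vgi-client", "catalog", "schema", "list",
            "--attach-opaque-data", attach_id, "--worker", worker,
        ])
    finally:
        # Detach even if listing raised, so the attach is not leaked.
        run(["vgi-client", "catalog", "detach", attach_id, "--worker", worker])
```

Calling list_schemas("./my_worker.py", "mydb") mirrors the bash script above; passing arguments as a list sidesteps the shell quoting that the bash version needs.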