Predictive Query Structure
A Predictive Query defines a predictive modeling task in Kumo using PQL (Predictive Query Language), a SQL-like syntax that specifies:
- Target – What you want to predict.
- Entity – Who you are making predictions for.
- Filters (optional) – Constraints on which entities or data to include.
Target
The target is the outcome you want to predict, defined after the PREDICT command.
For example, to predict total purchases per user over the next 30 days, the target is “sum of purchases over the next 30 days.”
Entity
The entity is the subject of your prediction—who the prediction is being made for.
For example, if predicting total purchases per user, then the user is the entity.
Aggregation Operators
When predicting an aggregation over time (e.g., total sales over 30 days), use an aggregation function with a column reference.
Example: Predicting Total Purchase Value per Customer
- SUM(TRANSACTIONS.PRICE, 0, 30) → Sums purchase values over the next 30 days.
- FOR EACH CUSTOMERS.CUSTOMER_ID → Predicts for each customer.
The above usage of the SUM() aggregation operator allows you to predict the total value of the purchases each customer will make in the next 30 days.
Within the aggregation function inputs, the start and end parameters refer to the time period you want to aggregate across, calculated in days. For example, 10 for start and 30 for end implies that you want to aggregate from 10 days later (excluding the 10th day) to 30 days later (including the 30th day). The time unit of the aggregation defaults to ‘days’ if none is specified.
If you’re making the prediction on 2020-01-01 00:00:00, Kumo will aggregate all rows with timestamps t where 2020-01-11 00:00:00 < t <= 2020-01-31 00:00:00.
When using aggregation with targets, both start and end values should be non-negative integers, and end values should be greater than start values.
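The window rule above can be sketched in plain Python. This is only an illustration of the arithmetic, not Kumo's implementation; the helper names `aggregation_window` and `in_window` are hypothetical and the unit is assumed to be days.

```python
from datetime import datetime, timedelta


def aggregation_window(anchor: datetime, start: int, end: int) -> tuple[datetime, datetime]:
    """Return the (exclusive lower, inclusive upper) bounds of a target window in days."""
    # Target windows need non-negative offsets, with end strictly greater than start.
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    return anchor + timedelta(days=start), anchor + timedelta(days=end)


def in_window(t: datetime, lower: datetime, upper: datetime) -> bool:
    # Half-open interval: the start boundary is excluded, the end boundary is included.
    return lower < t <= upper


anchor = datetime(2020, 1, 1)  # prediction time
lower, upper = aggregation_window(anchor, 10, 30)
print(lower, upper)
```

With a prediction time of 2020-01-01 00:00:00, a row stamped exactly 2020-01-11 00:00:00 falls outside the window, while a row stamped exactly 2020-01-31 00:00:00 falls inside it.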
Common Aggregation Functions
- SUM() – Total value over time.
- COUNT() – Number of occurrences over time.
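To make the difference between the two operators concrete, here is a minimal Python sketch that computes a sum and a count per customer over the same 0 to 30 day window. The toy rows and the `aggregate` helper are hypothetical, and the sketch only mimics the window semantics described in this section; how Kumo treats customers with no rows in the window is not covered here.

```python
from datetime import datetime, timedelta

# Hypothetical toy rows: (customer_id, timestamp, price).
transactions = [
    ("c1", datetime(2020, 1, 5), 20.0),
    ("c1", datetime(2020, 1, 20), 15.0),
    ("c2", datetime(2020, 1, 31), 40.0),  # exactly 30 days after the anchor: included
    ("c2", datetime(2020, 2, 10), 99.0),  # after the window: ignored
    ("c3", datetime(2020, 1, 1), 5.0),    # exactly at the anchor: excluded when start is 0
]


def aggregate(rows, anchor, start, end):
    """Return per-customer (sum of price, row count) over the window (start, end] in days."""
    lower = anchor + timedelta(days=start)
    upper = anchor + timedelta(days=end)
    totals, counts = {}, {}
    for customer_id, ts, price in rows:
        if lower < ts <= upper:
            totals[customer_id] = totals.get(customer_id, 0.0) + price
            counts[customer_id] = counts.get(customer_id, 0) + 1
    return totals, counts


totals, counts = aggregate(transactions, datetime(2020, 1, 1), 0, 30)
print(totals)  # SUM-style target: total purchase value per customer
print(counts)  # COUNT-style target: number of purchases per customer
```

In this sketch, customers without any row in the window are simply absent from both dictionaries.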
For a complete list of aggregation functions and further details, see Aggregation Operators.
Aggregation Window (Start & End)
- The start and end parameters define the prediction window in days.
- If the prediction date is 2020-01-01, then 10, 30 predicts transaction values from 2020-01-11 to 2020-01-31.
Aggregation Units
The time unit defaults to days. Supported units include:
- days (default)
- months
- hours
Filters (WHERE)
Filters refine a Predictive Query by removing irrelevant entities or restricting aggregation conditions.
For example, you can restrict predictions to active customers only (i.e., those who made at least one transaction in the past 30 days).
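The effect of such a filter can be sketched in Python as follows. The PQL syntax for the filter is not shown in this section, so the sketch only simulates the intended semantics: the `active_customers` helper and the toy rows are hypothetical, and the lookback boundaries (start excluded, prediction time included) are an assumption that mirrors the forward-looking window rule.

```python
from datetime import datetime, timedelta


def active_customers(rows, anchor, lookback_days=30):
    """Return ids with at least one transaction in the lookback window before the anchor."""
    lower = anchor - timedelta(days=lookback_days)
    # Assumed boundaries: (anchor - lookback, anchor].
    return {customer_id for customer_id, ts in rows if lower < ts <= anchor}


# Hypothetical toy rows: (customer_id, transaction timestamp).
rows = [
    ("c1", datetime(2019, 12, 20)),  # 12 days before the anchor: active
    ("c2", datetime(2019, 11, 1)),   # 61 days before the anchor: inactive
    ("c3", datetime(2020, 1, 15)),   # after the anchor: does not count as past activity
]
customers = ["c1", "c2", "c3"]

anchor = datetime(2020, 1, 1)  # prediction time
active = active_customers(rows, anchor)
entities = [c for c in customers if c in active]
print(entities)
```

Only the customers that pass the filter remain as entities, so predictions are made for them alone.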
Kumo supports advanced filtering, including:
- Inline filters inside aggregations
- Nested temporal filters
- Static date/time filters
- Multiple target conditions (AND/OR)
For a complete guide to filtering, see Predictive Query Reference.