
# SELECT

## Syntax

{% code overflow="wrap" %}

```sql
SELECT { <expression> [ [AS] <field_name>[::<column_type>] ] 
       | CAST(<expression> AS <column_type>) } [,...]
    FROM { <catalog>.<schema>.<resource>
         | <identifier> } 
              [ [AS] <alias> ]
    [ LET <identifier> = <expression> [, ...] ]
    WHERE time_filter() [ AND <where_filters> ]
    [ GROUP BY { <field_name> | <ordinal_position> } [, ...] ]
    [ LET <identifier> = <expression> [, ...] ]
    [ HAVING <having_filters> ]
    [ LIMIT <integer> ]
```

{% endcode %}

* `WHERE TIME_FILTER()` - Uses the default values, which depend on the context (for example, aggregated outputs default the window size to `RUN_INTERVAL`).
* `WHERE TIME_FILTER($event_time, interval '5' minute)` - Specifies a 5-minute window based on `$event_time`.
* `WHERE TIME_FILTER($event_time, interval '1' minute, timestamp '2023-04-04 12:20:00')` - Queries rows between 12:19:00 and 12:20:00 on 2023-04-04, based on `$event_time`.

In most situations, the default values should suffice, resulting in job definitions with a simple `WHERE` condition like this:

```sql
CREATE SYNC JOB my_job
AS INSERT INTO <target>
SELECT * 
  FROM <source>
  WHERE time_filter()
```

**Current Limitations**

* The `TIME_FILTER` function cannot be used outside the `WHERE` clause.
* It must be used as a standalone expression and cannot be passed as input to other functions or operators.
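
As a rough illustration of these limitations, the sketch below contrasts a placement that matches the syntax above with two that the limitations rule out. The `status` column is a hypothetical name used only for this example, and the rejected forms are shown as comments.

```sql
-- Accepted: time_filter() stands alone in WHERE and is combined
-- with other filters using AND.
SELECT *
  FROM <source>
  WHERE time_filter() AND status = 'active'   -- "status" is a hypothetical column

-- Not accepted: time_filter() used as input to an operator.
--   WHERE NOT time_filter()

-- Not accepted: time_filter() used outside the WHERE clause.
--   SELECT time_filter() FROM <source>
```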

{% hint style="warning" %}
Before the `time_filter()` function was introduced, a job's `WHERE` clause had to include the following condition:

`$commit_time between run_start_time() [ - INTERVAL '<integer>' <time_unit> ] AND run_end_time()`

The `time_filter()` function should be used in new jobs instead. However, you might see the old syntax in templates or older jobs in your account.
{% endhint %}
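
For orientation when reading older jobs, here is a sketch of how the legacy condition might look inside a query, next to the form recommended for new jobs. Both queries use the same `<source>` placeholder as the other examples on this page. The sketch does not imply that the two forms behave identically in every context.

```sql
-- Legacy form, as it may appear in templates or older jobs.
SELECT *
  FROM <source>
  WHERE $commit_time BETWEEN run_start_time() AND run_end_time()

-- Form recommended for new jobs.
SELECT *
  FROM <source>
  WHERE time_filter()
```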

#### Aggregated outputs

When a job aggregates data, the `windowSize` parameter of the `TIME_FILTER` function determines the range of source data included in each aggregation.

By default, aggregated outputs use the `RUN_INTERVAL` as the `windowSize`. This means that each execution of the job aggregates only the data that arrived since the last execution. To aggregate a larger range of data while still producing output frequently, set a `windowSize` that is larger than the `RUN_INTERVAL`.

#### Example

Let's say we want a job to aggregate data from the last hour, but we want it to output the aggregation result every 5 minutes. To achieve this, we can set a `RUN_INTERVAL` of **5 minutes** in our job options and a `windowSize` of **60 minutes** in our `time_filter()`:

```sql
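CREATE SYNC JOB my_job
    RUN_INTERVAL = 5 MINUTES   -- output a result every 5 minutes
AS INSERT INTO <target>
SELECT COUNT(*) 
  FROM <source>
  -- aggregate the last 60 minutes of data on each run
  WHERE time_filter($event_time, interval '60' minute)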
```

## `GROUP BY`

The `GROUP BY` clause specifies the column(s) used to group records together. It is typically used together with one or more [aggregate functions](/content/reference-1/functions-and-operators/functions/aggregate.md) in the selected columns.

Every non-aggregate column in the `SELECT` list must also appear in the `GROUP BY` clause.

Additionally, all partition columns should appear in the `GROUP BY` clause, because mapping an aggregated column to a partition column is not allowed.
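
A minimal sketch of these rules, assuming a hypothetical source with a `customer_id` column. The job name and the `order_count` alias are also made up for illustration.

```sql
CREATE SYNC JOB orders_per_customer    -- hypothetical job name
AS INSERT INTO <target>
SELECT customer_id,                    -- non-aggregate column, so it is listed in GROUP BY
       COUNT(*) AS order_count         -- aggregate column
  FROM <source>
  WHERE time_filter()
  GROUP BY customer_id
```

If the target were partitioned by another column, that column would also need to be selected and listed in `GROUP BY`.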

## `HAVING`

The `HAVING` clause filters grouped records based on conditions on aggregate columns.
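
As a sketch, the following query keeps only the groups with more than 10 records. The `customer_id` column, the `order_count` alias, and the threshold are assumptions chosen for illustration.

```sql
SELECT customer_id,                 -- hypothetical grouping column
       COUNT(*) AS order_count
  FROM <source>
  WHERE time_filter()
  GROUP BY customer_id
  HAVING COUNT(*) > 10              -- filter applied to the aggregated value
```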

## `LIMIT`

The `LIMIT` clause limits the number of records written to the target location.

This can be helpful when previewing the output of your job.
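
For instance, a preview query along these lines would return at most 10 records. The value 10 is arbitrary.

```sql
SELECT *
  FROM <source>
  WHERE time_filter()
  LIMIT 10   -- cap the number of records returned
```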

