
# Task Executions Table

Upsolver's system information provides internal insights that help you monitor and troubleshoot your jobs. Each job is divided into tasks, and each task is responsible for a specific piece of work, such as processing data or performing maintenance. This page describes the task executions table, which helps you understand how your jobs are running.

The task executions table enables you to monitor the execution of the tasks that run your jobs and maintain your tables. To monitor and troubleshoot your jobs, run the following query:

```sql
SELECT * FROM logs.tasks.task_executions;
```

The following three sections describe the task executions table:

1. [Task execution records](#task-run-records): This section includes a list of fields in the `task_executions` table. It includes the field name and data type, as well as a short description of how to interpret each value.
2. [Stage names](#stage-names): Upsolver operations comprise multiple stages that execute tasks to complete a job. This section describes each stage that can appear in the `stage_name` field, which helps you follow the progress of your jobs and identify the status of each stage.
3. [Task event types](#task-event-types): Each stage is a logical grouping of one or more tasks. This section describes the event types that can appear in the `task_event_type` field.

## Task execution records

Each record within the `task_executions` table describes a task being executed. The following is a list of the fields for each task:

<table><thead><tr><th width="310">Field name</th><th width="155">Data type</th><th>Description</th></tr></thead><tbody><tr><td><code>cluster_name</code> </td><td><code>string</code></td><td>The name of the cluster that processed this task.</td></tr><tr><td><code>cluster_id</code></td><td><code>string</code></td><td>The unique ID of the cluster that processed this task.</td></tr><tr><td><code>cloud_server_name</code></td><td><code>string</code></td><td>The ID of the cloud instance this job is running on.</td></tr><tr><td><code>stage_name</code></td><td><code>string</code></td><td>Describes the type of task being executed.<br><br>For descriptions of the different stage names, see <a href="#stage-names">Stage names</a>.</td></tr><tr><td><code>job_name</code></td><td><code>string</code></td><td>The name of the job that the task belongs to.</td></tr><tr><td><code>job_id</code></td><td><code>string</code></td><td>The unique ID of the job that the task belongs to.</td></tr><tr><td><code>task_name</code></td><td><code>string</code></td><td>The name of the task formatted as the <code>job_id</code> with a prefix or suffix descriptor attached.</td></tr><tr><td><code>task_start_time</code></td><td><code>timestamp</code></td><td><p>The start time of the window of data being processed. In a transformation job, this corresponds to the value of <code>run_start_time()</code>.</p><p><br>The difference between the <code>task_start_time</code> and <code>task_end_time</code> corresponds to the <code>RUN_INTERVAL</code> configured within the job options for transformation jobs.</p><p><br>For data ingestion jobs, this defaults to 1 minute.</p></td></tr><tr><td><code>task_end_time</code></td><td><code>timestamp</code></td><td><p>The end time of the window of data being processed. 
This corresponds to the value of <code>run_end_time()</code> within transformation jobs.</p><p><br>The difference between the <code>task_start_time</code> and <code>task_end_time</code> corresponds to the <code>RUN_INTERVAL</code> configured within the job options for transformation jobs. </p><p><br>For data ingestion jobs, this defaults to 1 minute.</p></td></tr><tr><td><code>shard</code></td><td><code>bigint</code></td><td>The shard number corresponding to this task.</td></tr><tr><td><code>total_shards</code></td><td><code>bigint</code></td><td><p>The total number of shards used to process the job for this execution.<br></p><p>This corresponds to the value configured by the <code>EXECUTION_PARALLELISM</code> job option. If the value of <code>EXECUTION_PARALLELISM</code> is altered at any point, the <code>total_shards</code> for future tasks belonging to that job are updated to match.</p></td></tr><tr><td><code>task_start_processing_time</code></td><td><code>timestamp</code></td><td>The time the task started being processed.</td></tr><tr><td><code>task_end_processing_time</code></td><td><code>timestamp</code></td><td>The time the task finished being processed.</td></tr><tr><td><code>task_items_read</code></td><td><code>bigint</code></td><td>The total number of records read.</td></tr><tr><td><code>bytes_read</code></td><td><code>bigint</code></td><td>The total bytes ingested from the source data in its original form, including header information.</td></tr><tr><td><code>bytes_read_as_json</code></td><td><code>bigint</code></td><td><p>The total bytes ingested from the source data if it were in a JSON format.<br></p><p>This is the number used to determine the volume of data scanned for billing purposes.</p></td></tr><tr><td><code>duration</code></td><td><code>bigint</code></td><td><p>The time in milliseconds it took to process this task.<br></p><p>This is equivalent to the difference between the <code>task_start_processing_time</code> and 
<code>task_end_processing_time</code>.</p></td></tr><tr><td><code>task_delay_from_start</code></td><td><code>bigint</code></td><td><p>The delay in milliseconds between the end of the data window and when the task began processing.<br></p><p>This is equivalent to the difference between the <code>task_end_time</code> and <code>task_start_processing_time</code>.</p></td></tr><tr><td><code>task_classification</code></td><td><code>string</code></td><td>The classification of the task as <code>user</code>, <code>system</code>, <code>input</code>, or <code>metadata</code> based on the type of task being executed.</td></tr><tr><td><code>task_error_message</code></td><td><code>string</code></td><td>The error message, if an error is encountered.</td></tr><tr><td><code>task_event_type</code></td><td><code>string</code></td><td><p>Classifies the task into event types.</p><p></p><p>For descriptions of the different event types, see <a href="#task-event-types">Task event types</a>.</p></td></tr><tr><td><code>organization_name</code></td><td><code>string</code></td><td>The name of the organization that the task belongs to.</td></tr><tr><td><code>log_processing_time</code></td><td><code>timestamp</code></td><td>The time the log record was processed.</td></tr><tr><td><code>organization_id</code></td><td><code>string</code></td><td>The unique ID of your organization (the same as the organization name).</td></tr><tr><td><code>partition_date_str</code></td><td><code>string</code></td><td>The partition date as a string.</td></tr><tr><td><code>partition_date</code></td><td><code>date</code></td><td>The date column that the table is partitioned by. <br>Always qualify a <code>partition_date</code> filter in your queries to avoid full scans.</td></tr><tr><td><code>upsolver_schema_version</code></td><td><code>bigint</code></td><td>The system table's schema version. It changes when the user edits the output job that is written to this table.</td></tr></tbody></table>
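
As an illustration of how these fields can be combined, the following sketch lists recent failed tasks together with their error messages. It filters on `partition_date`, as recommended in the table above. The date value and the `DATE '...'` literal syntax are assumptions for illustration; adjust them to the dates and SQL dialect you are querying with.

```sql
-- Sketch: list failed tasks for one day, newest first.
-- The date below is a placeholder; replace it with the day you want to inspect.
SELECT job_name,
       stage_name,
       task_start_time,
       task_end_time,
       duration,
       task_error_message
FROM logs.tasks.task_executions
WHERE partition_date = DATE '2024-01-01'   -- assumed literal syntax; limits the scan to one partition
  AND task_event_type = 'failed'
ORDER BY task_start_time DESC
LIMIT 50;
```

Restricting the query to a single `partition_date` keeps it from scanning the whole table, and `task_error_message` is usually the first field to inspect when a task fails.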

## Stage names

The following table describes each stage that can appear in the `stage_name` field.

<table data-full-width="false"><thead><tr><th width="284">Stage name</th><th>Description</th></tr></thead><tbody><tr><td>file discovery</td><td>Discovers the files within a file-based data source such as Amazon S3, Azure Blob Storage, or Google Cloud Storage.</td></tr><tr><td>data ingestion</td><td>Pulls data from the data source.</td></tr><tr><td>parse data</td><td>Parses the data discovered during <strong>file discovery</strong> or <strong>data ingestion</strong> stage.</td></tr><tr><td>Ingestion state maintenance</td><td>Performs maintenance work when data is being ingested.</td></tr><tr><td>write to storage</td><td>Writes output to object store.</td></tr><tr><td>write to target</td><td>Writes the data to the target location.</td></tr><tr><td>cleanup</td><td><p>Deletes old files that are unnecessary.<br></p><p>This can be cleaning up unneeded files after compaction or removing other temporary files such as deleting batch files once the data has been parsed.</p></td></tr><tr><td>table state maintenance</td><td><p>Collects and maintains metadata about files as they are written to tables.<br></p><p>This metadata is later used to perform tasks such as maintaining the file system, running compactions, running queries, and more.</p></td></tr><tr><td>retention</td><td>Deletes old data and metadata that have passed the retention period as configured when the table was created.</td></tr><tr><td>build indices</td><td>Builds indices for materialized views by reading the raw data and creating small files for the data that are then compacted and merged together.</td></tr><tr><td>compact indices</td><td>Compacts indices for materialized views after they have been built.</td></tr><tr><td>aggregation</td><td>Builds and compacts indices to perform aggregation for aggregated outputs.</td></tr><tr><td>collect statistics</td><td>Gathers metadata from the ingestion or output job by generating indexes.</td></tr><tr><td>compact statistics</td><td>Compacts and merges the metadata 
index.</td></tr><tr><td>partition metadata</td><td>Processes metadata for partition management and maintenance.</td></tr><tr><td>partition maintenance</td><td>Creates new partitions and deletes old ones.</td></tr><tr><td>partition management</td><td>Creates new partitions and deletes old ones.</td></tr><tr><td>count distinct metadata</td><td>Collects the number of distinct values for a field.</td></tr><tr><td>event type metadata</td><td>Builds the metadata index for a field when an event type is set in Upsolver Classic. This allows us to filter by event type and show statistics per event type.</td></tr><tr><td>upsert metadata</td><td>Maintains metadata about primary keys in order to know how and where to perform updates when they arrive as events.</td></tr><tr><td>monitoring metadata</td><td>Ensures metadata is written successfully.</td></tr><tr><td>dedup index</td><td>Builds the deduplication index. This index is used to run <code>IS_DUPLICATE</code> calculations.</td></tr><tr><td>coordinate compaction</td><td>Coordinates partition compactions by checking available files. Simultaneously maintains other table metadata.</td></tr><tr><td>compaction</td><td>Compacts smaller files into larger ones to optimize query performance when writing to a data lake output.</td></tr><tr><td>upsert compaction</td><td>Compacts data from multiple files to delete old rows that have a newer update.</td></tr><tr><td>compaction state maintenance</td><td>Performs maintenance work to ensure compaction state is healthy.</td></tr><tr><td>maintenance</td><td>Performs general maintenance tasks.</td></tr><tr><td>internal task</td><td>Performs tasks for working with connections to external environments.</td></tr></tbody></table>
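
To see which stages account for most of a job's processing time, you can group task executions by `stage_name`. The query below is a minimal sketch: the job name `my_job` and the date are hypothetical placeholders, and the `DATE '...'` literal syntax is an assumption about the SQL dialect in use.

```sql
-- Sketch: per-stage summary for one job on one day.
-- 'my_job' and the date are placeholders, not values from this documentation.
SELECT stage_name,
       COUNT(*)                   AS finished_tasks,
       AVG(duration)              AS avg_duration_ms,   -- duration is reported in milliseconds
       MAX(task_delay_from_start) AS max_delay_ms       -- delay is reported in milliseconds
FROM logs.tasks.task_executions
WHERE partition_date = DATE '2024-01-01'
  AND job_name = 'my_job'
  AND task_event_type = 'finished'
GROUP BY stage_name
ORDER BY avg_duration_ms DESC;
```

Only `finished` events are counted so that each completed task contributes once; `started` and `heartbeat` records for the same task are excluded.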

## Task event types

The following table describes the task event types that can appear in the `task_event_type` field.

<table><thead><tr><th width="254">Event type</th><th>Description</th></tr></thead><tbody><tr><td>started</td><td>The task has started.</td></tr><tr><td>finished</td><td>The task has completed successfully.</td></tr><tr><td>heartbeat</td><td>Indicates that the task is still running. A heartbeat is sent every 5 minutes so that long-running tasks can be identified and their current state (such as the current duration and bytes read) is recorded.</td></tr><tr><td>canceled</td><td>The task was canceled.</td></tr><tr><td>no-resources</td><td>Indicates a lack of resources to start a task. This is usually due to a connection limitation.</td></tr><tr><td>failed</td><td>The task has failed. Check <code>task_error_message</code> to better understand the error encountered.</td></tr><tr><td>failed-build</td><td>Failed to build a task.</td></tr><tr><td>failed-recoverable</td><td>An intermittent error has occurred (e.g. a file was modified while it was being read). The task will retry and recover from the error, and the resulting data will be consistent.</td></tr><tr><td>dry-run-failed</td><td>A task that is part of Upsolver's automated testing of a new version has failed.</td></tr><tr><td>ignored-dry-run-failure</td><td>The dry-run failure was ignored because it was a false positive.</td></tr></tbody></table>
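
The event types above can be used to spot jobs that are failing, retrying, or waiting for resources. The following sketch counts problem events per job for a single day; the date is a hypothetical placeholder and the `DATE '...'` literal syntax is an assumption about the SQL dialect in use.

```sql
-- Sketch: count failure-related events per job for one day.
-- The date is a placeholder; replace it with the day you want to inspect.
SELECT job_name,
       task_event_type,
       COUNT(*) AS events
FROM logs.tasks.task_executions
WHERE partition_date = DATE '2024-01-01'
  AND task_event_type IN ('failed', 'failed-build', 'failed-recoverable', 'no-resources')
GROUP BY job_name, task_event_type
ORDER BY events DESC;
```

A high count of `failed-recoverable` events is not necessarily a problem, because those tasks retry and recover; `failed` and `failed-build` events are the ones to follow up on using `task_error_message`.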

