# dataset\_column\_stats

### Overview

The `system.insights.dataset_column_stats` table provides per-column statistics for the datasets in your catalog. Use it for data profiling, query performance optimization, and monitoring schema changes. The statistics are computed by Upsolver's real-time stream processing engine.

### Columns

The following table describes the columns in the `system.insights.dataset_column_stats` system table:

<table><thead><tr><th width="278">Column Name</th><th width="133.33333333333331">Data Type</th><th>Description</th></tr></thead><tbody><tr><td><code>catalog</code></td><td>STRING</td><td>The name of the catalog where the dataset resides.</td></tr><tr><td><code>dataset</code></td><td>STRING</td><td>The name of the dataset.</td></tr><tr><td><code>column_name</code></td><td>STRING</td><td>The name of the column.</td></tr><tr><td><code>column_type</code></td><td>STRING</td><td>The data type of the column (e.g., STRING, BIGINT).</td></tr><tr><td><code>density</code></td><td>INT</td><td>The density of data in the column.</td></tr><tr><td><code>density_in_parent</code></td><td>INT</td><td>The density of this column relative to its parent, if applicable.</td></tr><tr><td><code>total_count</code></td><td>INT</td><td>The total number of records in the column.</td></tr><tr><td><code>min_distinct_values</code></td><td>INT</td><td>The minimum number of distinct values found in the column.</td></tr><tr><td><code>max_distinct_values</code></td><td>INT</td><td>The maximum number of distinct values found in the column.</td></tr><tr><td><code>values_appear_unique</code></td><td>BOOLEAN</td><td>Indicates whether the values appear to be unique based on the current statistics. <strong>Note</strong>: The values may still not be unique, since infrequent repetitions can occur.</td></tr><tr><td><code>top_values</code></td><td>ARRAY</td><td>An array of the most frequent values along with their counts.</td></tr><tr><td><code>min_value</code></td><td>VARIANT</td><td>The smallest value in the column.</td></tr><tr><td><code>max_value</code></td><td>VARIANT</td><td>The largest value in the column.</td></tr><tr><td><code>length_distribution</code></td><td>ARRAY</td><td>An array representing the distribution of the length of values in the column.</td></tr><tr><td><code>value_distribution</code></td><td>ARRAY</td><td>An array representing the distribution of values in the column.</td></tr><tr><td><code>first_seen</code></td><td>TIMESTAMP</td><td>The timestamp when data in the column was first seen.</td></tr><tr><td><code>last_seen</code></td><td>TIMESTAMP</td><td>The timestamp when data in the column was last seen.</td></tr></tbody></table>
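
As an illustration of how these columns might support data profiling and schema monitoring, the following Python sketch post-processes rows that have already been retrieved from `system.insights.dataset_column_stats`. It makes several assumptions that do not come from this page: the rows are available as plain dictionaries keyed by column name, the sample dataset and values are invented, the helper names `candidate_keys` and `stale_columns` are hypothetical, and the 30-day staleness threshold is arbitrary. How the rows are fetched is left out.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rows shaped like a subset of the documented columns.
# The dataset name, column names, and values are invented for illustration.
rows = [
    {"dataset": "orders", "column_name": "order_id", "column_type": "STRING",
     "values_appear_unique": True,
     "last_seen": datetime(2024, 1, 10, tzinfo=timezone.utc)},
    {"dataset": "orders", "column_name": "status", "column_type": "STRING",
     "values_appear_unique": False,
     "last_seen": datetime(2024, 1, 10, tzinfo=timezone.utc)},
    {"dataset": "orders", "column_name": "legacy_code", "column_type": "STRING",
     "values_appear_unique": False,
     "last_seen": datetime(2023, 11, 1, tzinfo=timezone.utc)},
]


def candidate_keys(rows):
    """Return columns whose values currently appear unique.

    This is only a hint: values_appear_unique is based on the current
    statistics, so infrequent repetitions may still exist.
    """
    return [r["column_name"] for r in rows if r["values_appear_unique"]]


def stale_columns(rows, now, max_age):
    """Return columns whose last_seen is more than max_age before now."""
    return [r["column_name"] for r in rows if now - r["last_seen"] > max_age]


now = datetime(2024, 1, 11, tzinfo=timezone.utc)
print(candidate_keys(rows))                             # ['order_id']
print(stale_columns(rows, now, timedelta(days=30)))     # ['legacy_code']
```

A column that stops receiving data, such as `legacy_code` in this sample, may point to an upstream schema change. A column flagged by `values_appear_unique` is at most a candidate key and should be verified before it is relied on.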

