Early Access: The content on this website is provided for informational purposes only in connection with pre-General Availability Qlik Products.
All content is subject to change and is provided without warranty.

Assessing data quality

After opening a dataset, you can review several parts of the overview to learn more about its overall quality, its schema, and the quality statistics and semantic type of each column.

Information note: You need a Qlik Talend Cloud Enterprise subscription.

Quality indicators of the dataset

When you open the overview of a dataset that has just been registered, most of the information is grayed out. To calculate the data quality for the first time, click the Compute button. If the quality has already been computed but you want to make sure that it reflects the current data, click the Refresh button.

Each compute or refresh in pushdown mode incurs costs in your Cloud data warehouse (Snowflake or Databricks). For more information, see Data quality for connection-based datasets.

The quality is displayed in two main sections:

  • The Data quality area, which includes:

    • The distribution of valid, invalid, and empty values across the whole dataset, shown as a three-color quality bar with the percentage of each category.

    • A Validity score, expressing the percentage of valid values, without taking empty values into account.

    • A Completeness score, expressing the percentage of values that are not empty.

  • The Schema area, which shows the fields of the dataset, the data type or semantic type applied to each field, and a quality bar for each field.
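The relationship between the quality bar and the two scores described above can be illustrated with a short calculation. The following Python sketch is only an interpretation of the definitions on this page, not the product's implementation: the names QualityCounts, bar_percentages, validity_score, and completeness_score are hypothetical, and the sketch assumes that every value falls into exactly one of the three categories (valid, invalid, or empty).

```python
from dataclasses import dataclass


@dataclass
class QualityCounts:
    """Hypothetical container for the three value categories."""
    valid: int
    invalid: int
    empty: int


def bar_percentages(c: QualityCounts) -> tuple[float, float, float]:
    """Share of valid, invalid, and empty values over all values."""
    total = c.valid + c.invalid + c.empty
    if total == 0:
        return (0.0, 0.0, 0.0)
    return (100.0 * c.valid / total,
            100.0 * c.invalid / total,
            100.0 * c.empty / total)


def validity_score(c: QualityCounts) -> float:
    """Percentage of valid values, ignoring empty values (assumed reading)."""
    non_empty = c.valid + c.invalid
    return 100.0 * c.valid / non_empty if non_empty else 0.0


def completeness_score(c: QualityCounts) -> float:
    """Percentage of values that are not empty."""
    total = c.valid + c.invalid + c.empty
    return 100.0 * (c.valid + c.invalid) / total if total else 0.0


# Example: 80 valid, 10 invalid, and 10 empty values.
counts = QualityCounts(valid=80, invalid=10, empty=10)
print(bar_percentages(counts))
print(round(validity_score(counts), 2))
print(completeness_score(counts))
```

Under these assumptions, the example dataset has a completeness of 90% (90 of 100 values are not empty), while its validity is about 88.89% (80 of the 90 non-empty values are valid). The validity score is therefore higher than the 80% valid share shown on the quality bar, because empty values are excluded from its denominator.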

Tip note: For connection-based datasets, if the schema and quality of the dataset fail to be retrieved, check that the Role field is properly filled in for the connection you set up in the Qlik Analytics Services hub, and that the role itself grants the necessary permissions on the database table.

Semantic types discovery

Each column of a dataset is automatically assigned a semantic type to better describe its content. Behind the scenes, a data discovery operation determines which type to assign.

You can also create semantic types and manage the values in each semantic type.

For more information, see Managing semantic types.
