Managing dataset metadata
Catalog provides several layers of key descriptive and technical metadata about datasets. This information assists in the organization and assignment of resources and access.
App developers use profile statistics and data sampling to gain ideas and direction for creating apps and planning visualizations. Field profiling can help data analysts and business users gain insights faster without having to create an app first. Whether you are a data administrator or a data consumer, knowing the provenance of your datasets and trusting the accuracy of their metadata increases your confidence when analyzing data assets.
Permissions to view datasets and metadata
Permissions are required in a space to view datasets and dataset metadata. Both actions map to the permission List and use data source in the space. For more information, see Managing permissions in shared spaces or Managing permissions in managed spaces.
- View datasets > List and use data source
- View metadata > List and use data source
Dataset overview
The dataset Overview tab provides a summary of descriptive and technical metadata about your datasets.
The overview tab captures:
- Technical metadata such as size, owner, file type, and created, last modified, and metadata refresh timestamps. Metatags that have been applied to the dataset display above this information.
- Classifications, which are applied to datasets to associate them with user-defined logical subject areas.
Do the following:
- From the Home tab in Qlik Cloud Analytics, select the Catalog icon on the left-hand navigation bar; or from the Catalog tab, filter on Types: Data.
  When you hover over a dataset tile, the data file extension icon (example: .XLSX) changes to an Open dataset button. The source file name displays below it. A dataset inherits the name of the original data file, and the name can be edited.
- Select Open dataset to display an Overview of that dataset.
The heading of the dataset provides the following metadata:
Metadata | Description |
---|---|
Type | The type of dataset, such as delimited or XLSX. |
Rows | Number of records in the dataset. |
Fields | Number of columns in the dataset. |
Size | File or content size (example: 9.51 MiB). |
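The Size value is expressed in mebibytes (MiB), where 1 MiB is 1024 × 1024 bytes. As a minimal sketch of that conversion, the hypothetical helper below formats a byte count in the same style as the example above. The two-decimal rounding is an assumption for illustration, not a description of how the catalog computes the displayed value.

```python
def format_mib(size_bytes: int) -> str:
    """Format a byte count as mebibytes (1 MiB = 1024 * 1024 bytes).

    Hypothetical helper; the two-decimal rounding is an assumption.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return f"{size_bytes / (1024 * 1024):.2f} MiB"


if __name__ == "__main__":
    # A file of 9,971,958 bytes is roughly 9.51 MiB.
    print(format_mib(9_971_958))
```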
Details provides the following metadata:
Metadata | Description |
---|---|
Tag | Metatags that have been applied display (example: tag1, tier3, upgrade, and so on). |
Source | Original data resource name (example: MyVolumes.txt). |
Profile refreshed | Timestamp of latest refresh of metadata derived from the source of the dataset, such as the profile, the number of records, and the number of columns. |
Metadata modified | Timestamp of the last modification made (example: Feb 18, 2022 7:21 PM). This value changes when the following events occur: Reload, rename, change description, change owner, change script. |
Metadata created | Timestamp of dataset object creation (example: Feb 18, 2022 7:21 PM). |
Space | The linked name of the destination space. Depending on permissions, the space could be a Personal, Shared, Managed, or Data space. |
Owner | The owner of the content (example: JS Jan Smith). |
Creator | The creator of the content (example: JS Jan Smith). |
Used in | The number of applications using a particular dataset. |
Viewed by | The number of unique viewers over the last 28 days. |
Tagging datasets
Tags (also known as metatags) are applied by users to assist in locating and organizing data. Data contributors enter and apply free-form tags to datasets for improved search and categorization. This is a useful tool for data administrators who need to filter on particular types of data assets for many reasons, including allocation of cost center resources, segmentation of sales and marketing organizations, and permissions and governance strategy. App developers and data consumers use tags to identify datasets for improved efficiency and organization.
Metatags that have been applied to a dataset display directly above detailed metadata in the overview tab.
Applying metatags to datasets
Do the following:
- Select Open dataset, open the menu, and select Edit dataset; or, from the menu on the dataset tile, select Edit. A box appears where the dataset Name, Description, and Tags can be edited. Tags that have already been applied to the dataset appear in the list.
- In the Tags box, enter tags made up of any character string (spaces and special characters are allowed with a limit of 31 characters across multiple tags). Enter each tag separately, then select Save to save the new tags. Individual tags can be deleted by selecting x on the tag.
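If you prepare tags outside the catalog before entering them, a small check can catch entries that would be rejected. The sketch below is illustrative only: the function name is hypothetical, it assumes the 31-character limit applies to each tag individually, and the trimming and de-duplication steps are conveniences that the catalog itself may or may not perform.

```python
MAX_TAG_LENGTH = 31  # assumption: the 31-character limit is applied per tag


def normalize_tags(raw_tags: list[str]) -> list[str]:
    """Trim, validate, and de-duplicate free-form tags, keeping input order."""
    cleaned: list[str] = []
    for tag in raw_tags:
        tag = tag.strip()
        if not tag:
            continue  # ignore empty entries
        if len(tag) > MAX_TAG_LENGTH:
            raise ValueError(f"tag {tag!r} exceeds {MAX_TAG_LENGTH} characters")
        if tag not in cleaned:
            cleaned.append(tag)  # spaces and special characters are kept as-is
    return cleaned


if __name__ == "__main__":
    print(normalize_tags(["tag1", " tier3 ", "upgrade", "tier3"]))
```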
Filtering on metatags
Do the following:
- Open the Catalog tab if it is not open. Under the Types dropdown, select Data.
- Select All filters to open the left-side Filters panel; scroll to the bottom and enter Tags on which to filter datasets.
Metadata refresh
Dataset metadata in the catalog adheres to a last-known state metadata management model. This model provides information and actions so that you always know how up-to-date your derived metadata is. Derived metadata reflects the state of your data and is distinct from user- and system-controlled metadata.
Select the green refresh icon to refresh derived metadata. If no changes are detected and metadata is up-to-date, the refresh icon appears gray. The Metadata refresh date field provides the time of the last derived metadata refresh.
Metadata refresh is initiated when changes to the schema are detected. If there is a change to the data (for example, if data is added or removed), the refresh icon turns green. If you select the icon, Modified date changes but Metadata refresh date does not, because there was no change to the derived metadata.
Derived metadata is refreshed at different times depending on whether the dataset is uploaded to Qlik Cloud or if it is an external dataset, whether it is newly registered or already in the system:
- Data that is newly registered in the catalog is automatically profiled upon import.
  QVD and Parquet files must be manually profiled by clicking Profile dataset.
- Data already in the system without derived metadata may have never had profile metadata calculated. Opening an existing dataset that has not had a profile computed will trigger profiling. If there are updates to the file after this computation, the refresh icon will again appear green, indicating that the dataset could be refreshed by selecting the icon.
- When the system detects a change in the schema of a dataset table, the metadata refresh icon is green, indicating that the derived metadata could be refreshed to reflect the current state of the data.
- External resources will always display a green refresh icon. Select the metadata refresh icon to ensure that the derived metadata reflects the current state of the data.
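The icon behavior described in this section can be summarized as a small decision rule. The sketch below is a simplified model for illustration, not the catalog's implementation: the `DatasetState` class and its field names are hypothetical, and the rule covers only the cases stated above (external resources, detected changes, and up-to-date metadata).

```python
from dataclasses import dataclass


@dataclass
class DatasetState:
    # Hypothetical fields, named for this illustration only.
    is_external: bool      # external resource rather than data uploaded to Qlik Cloud
    change_detected: bool  # a schema or data change was detected since the last refresh


def refresh_icon_color(state: DatasetState) -> str:
    """Return the refresh icon color implied by the rules in this section."""
    if state.is_external:
        return "green"  # external resources always display a green icon
    if state.change_detected:
        return "green"  # derived metadata could be refreshed
    return "gray"       # no changes detected; metadata is up-to-date


if __name__ == "__main__":
    print(refresh_icon_color(DatasetState(is_external=False, change_detected=True)))
```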
Configuring dataset classifications
Classifications can be applied to datasets to associate them with specific user-defined subject areas. Classifications can be a valuable tool for identifying sensitive information or simply distinct subject areas.
Do the following:
- From the dataset Overview tab, locate the Classifications section and select Add classification. Enter free-form text describing a Subject area to which this dataset belongs. If you want to apply a user-defined classification specific to an industry or use case, enter a description that will identify the dataset with that policy or grouping.
Permissions to apply classifications and tags
Permissions are required to edit and apply classifications and metatags. For details, look for the permission Edit and apply properties to data source in these topics: Managing permissions in shared spaces or Managing permissions in managed spaces.
Viewer and item usage metrics
Viewer and usage metrics allow you to quantify the value of your content at a glance by showing both the number and trend of unique viewers in the last 28 days (Viewed by) as well as the number of applications that are currently using a particular item (Used in).
Usage metrics are turned on by default in a tenant and can be turned off by a tenant administrator. See Displaying content usage metrics. If you do not see these statistics in your tenant, it is likely they are turned off.
Viewer metrics
An item's view count over the last 28 days is a good indication of its popularity. Knowing how many times an item has been viewed recently also helps content owners gain valuable insights into their work. For instance, an item that has been viewed by a relatively small number of users might indicate that the item is no longer useful or needs to be improved to increase its popularity.
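To make the Viewed by definition concrete, the sketch below counts unique viewers in a 28-day window from a list of view events. It is illustrative only: the event format (a user name paired with a timestamp) and the function name are assumptions, and it does not describe how Qlik Cloud stores or computes this metric.

```python
from datetime import datetime, timedelta


def unique_viewers(view_events, now, window_days=28):
    """Count distinct users with at least one view in the last `window_days` days.

    `view_events` is a hypothetical list of (user, timestamp) tuples.
    """
    cutoff = now - timedelta(days=window_days)
    # A set keeps each user once, however many times they viewed the item.
    return len({user for user, viewed_at in view_events if cutoff < viewed_at <= now})


if __name__ == "__main__":
    events = [
        ("jan", datetime(2022, 3, 10)),
        ("jan", datetime(2022, 3, 15)),  # repeat view, same user
        ("ali", datetime(2022, 3, 1)),
        ("sam", datetime(2022, 1, 1)),   # outside the 28-day window
    ]
    print(unique_viewers(events, now=datetime(2022, 3, 18)))
```

Because repeat views by the same user are counted once, the metric reflects how many people use an item rather than how often it is opened.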
Item usage metrics
You can see how many applications are using a particular item at any given time and easily drill down further (by clicking the number) to view impact analysis. As items with a higher number of dependencies are usually of higher quality, an awareness of such dependencies provides a useful means of quantifying item quality. Data and analytics producers can then leverage higher quality items to create additional content, while content owners can determine the impact of any changes to the content.
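Conceptually, the Used in number is a count of the applications that depend on an item. The sketch below shows that idea with a hypothetical mapping from app names to the datasets each app loads. The mapping, the app names, and the function name are assumptions for illustration; in the product, this information comes from impact analysis.

```python
def apps_using(dataset: str, app_sources: dict[str, set[str]]) -> list[str]:
    """Return the sorted names of apps whose sources include `dataset`."""
    return sorted(app for app, sources in app_sources.items() if dataset in sources)


if __name__ == "__main__":
    # Hypothetical dependency data: app name -> datasets the app loads.
    app_sources = {
        "Sales overview": {"MyVolumes.txt", "Regions.xlsx"},
        "Capacity planning": {"MyVolumes.txt"},
        "HR dashboard": {"Employees.qvd"},
    }
    dependents = apps_using("MyVolumes.txt", app_sources)
    print(len(dependents), dependents)
```

The length of the returned list corresponds to the Used in count, and the list itself is the starting point for judging the impact of a change to the dataset.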
Understanding metrics icons
The table below describes the icons that are used to display viewer and item usage metrics in Grid view. Hover over an icon to bring up a tooltip with more information. Note that the arrowhead icon used to indicate viewer trends is also shown in List view.
Details | Description |
---|---|
Viewed by | The number of unique viewers over the last 28 days. |
Used in | The number of applications using a particular item. You can click the icon to view impact analysis. |
Metrics locations
Viewer and usage metrics are available in the locations listed below.
Below the data asset tile in Grid view:
Hovering over the metrics icons will show a tooltip with more information.
In dataset Overview tab (the default view when you open a dataset):
Hovering over the icon will show a tooltip with more information. Click the Used in number to view impact analysis.
In List view:
For any item, hovering over the Viewed by or Used in numbers will show a tooltip with more information. You can also click the number in the Used in column to view impact analysis. Note that the Used in metric is not shown for apps as it is only relevant for datasets.
Options
Select the menu for the following options:
- Add to Collection: Collections are organizations of objects in activity centers. Select this option to:
  - Search for a collection
  - Create collection
  - Add to a collection
- Rename: Select to edit Name, Description, or Tags.
- Lineage: See Analyzing lineage for apps, scripts, and datasets
- Impact analysis: See Analyzing impact analysis for apps, scripts, and datasets
- File format settings: See Uploading datasets and editing file format settings