Designing your data integration project

Qlik Talend Data Integration lets you create data pipelines that perform a variety of data integration tasks in support of your data architecture and analytics requirements.

You create your data integration flow in a project, using data tasks. The project is associated with a data platform that is used as the target for all output. The project is stored in a data space. You must also create connections to your sources and targets.

Create a space

Working in spaces in Data Integration

Create a data space that is used to create and store your project. Inside the space, you can also create new connections using connectors, and manage access to Data Movement gateways.

Create connections to your sources and targets

Setting up connections to data sources

Create connections to your data sources.

Setting up connections to targets

Create connections to your target platform.

Qlik Data Gateway - Data Movement

Set up Qlik Data Gateway - Data Movement to facilitate secure data movement from your enterprise data sources and SaaS applications to supported targets.

Create a project

When you create a project, you must select your use case.

Creating a data pipeline project

Data pipeline projects ingest data from a large number of supported sources into a data platform, where you can then transform the data with ELT (pushdown) transformations to support data lakehouse and data warehouse architectures. Pipelines support log-based CDC and incremental data sources, and provide a range of options for ingesting data into major data warehouse platforms.

Use a data pipeline project when you want to:

Support type 1 and type 2 data structures with your ingestion processes.
Transform and reshape your data into fit-for-purpose output or star schemas for analytic workloads.
Create a lakehouse based on Iceberg.
Create a Qlik Open Lakehouse based on Apache Iceberg and mirror tables to Snowflake.
Create complex pipelines that are managed across projects for organizational or functional boundaries.
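As a rough illustration of the type 1 and type 2 data structures mentioned in the list above, the following sketch contrasts the two approaches in plain Python. It is not Qlik code: the names `Type2Row`, `apply_type1`, `apply_type2`, and the `city` attribute are hypothetical, and the sketch assumes the common convention that type 1 overwrites a value in place while type 2 keeps every version with a validity range.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Type2Row:
    # Hypothetical row layout for a type 2 (history-keeping) table.
    key: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_type1(current: dict[int, str], key: int, city: str) -> None:
    """Type 1: overwrite in place, so only the latest value is kept."""
    current[key] = city


def apply_type2(history: list[Type2Row], key: int, city: str, as_of: date) -> None:
    """Type 2: close the open version and append a new one, keeping history."""
    for row in history:
        if row.key == key and row.valid_to is None:
            if row.city == city:
                return  # value unchanged, nothing to record
            row.valid_to = as_of  # close the previous version
    history.append(Type2Row(key, city, as_of))


# Apply the same two changes to both structures.
current: dict[int, str] = {}
history: list[Type2Row] = []
for day, city in [(date(2024, 1, 1), "Lund"), (date(2024, 6, 1), "Malmo")]:
    apply_type1(current, 1, city)
    apply_type2(history, 1, city, day)
```

After both changes, `current` holds only the latest value for key 1, while `history` holds two rows: a closed version and an open current version.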

Creating a replication project

Replication projects support direct replication from a large number of supported sources to data lakes or any supported target platform. Data is applied directly to your target structures, but complex transformations and reshaping of data are not supported. Replication pipelines support a larger set of target technologies for replication scenarios.

Use a replication project when you want to:

Replicate data to your target without complex transformations on that data.
Replicate data to a target not supported by data pipelines.
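The choice between the two use cases can be summarized as a small decision rule. The sketch below is a simplification of the guidance on this page, not part of the product: the function `choose_project_type` and its two flags are hypothetical, and a real decision may also weigh factors such as history tracking or lakehouse requirements.

```python
def choose_project_type(needs_complex_transformations: bool,
                        target_supported_by_pipelines: bool) -> str:
    """Suggest a project use case from two simplified yes/no questions."""
    if needs_complex_transformations and not target_supported_by_pipelines:
        # Replication does not support complex transformations, and the
        # target cannot be used by a data pipeline project.
        raise ValueError("no project type fits: consider a different target")
    if needs_complex_transformations:
        return "data pipeline"
    # No complex transformations needed: direct replication is enough,
    # and it also covers targets that data pipelines do not support.
    return "replication"


suggestion = choose_project_type(needs_complex_transformations=True,
                                 target_supported_by_pipelines=True)
```

Here `suggestion` is `"data pipeline"`, because transformations are needed and the target is assumed to be supported by data pipeline projects.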

Manage versions of your pipeline project

Manage your projects with version control

Use version control to manage development of a data project, and to keep track of changes.
