AI processor
Uses generative AI functions from your third-party target data platform to transform the data.
The AI processor uses the native generative AI model capabilities of your third-party target data platform. For example, you can translate, classify, or summarize data in your transformation.
Availability
AI processor capability is available for projects using the following target data platforms.
- Snowflake
- Databricks
Snowflake
AI processor capability is available for projects with Snowflake as target data platform, using Snowflake Cortex AI APIs.
The following functions are available:
- Analyze sentiment
- Classify
- Summarize
- Translate
For more information about the functions, see Snowflake documentation: Available functions.
Databricks
AI processor capability is available for projects with Databricks as target data platform, using Databricks Foundation Model APIs. This Databricks capability is in Public Preview and can have limitations. For example, it is not supported on Databricks SQL Classic. For information about function-specific limitations, see the link to Databricks documentation for each function.
See also Databricks Previews support & details.
The following functions are available. Select which AI function to use in Function name.
- Analyze sentiment
- Classify
- Fix grammar
- Mask
- Similarity
- Summarize
- Translate
For more information about the functions, see Databricks documentation: Alphabetical list of built-in functions.
Analyze sentiment
Perform sentiment analysis on input text.
Available in: Databricks, Snowflake
Input
Property name | Configuration |
---|---|
Content | Select the column you want to perform sentiment analysis on. You can only select columns of string type. |
Output column name | Enter a name for the generated output column. Example of the expected format: ASDasd123_4564 |
Limit for preview | Set the number of rows to load in data preview. The default value is 10. If you set this to 0 there is no limit. |
Output
Target data platform | Configuration |
---|---|
Databricks | The sentiment is returned as a text string with the value positive, negative, neutral, or mixed. If the sentiment cannot be detected, null is returned. |
Snowflake | The sentiment is returned as a score between -1 and 1 for the given English-language input text. A score of -1 corresponds to the most negative sentiment, and 1 to the most positive sentiment. Values around 0 correspond to a neutral sentiment. |
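Because the two platforms return different shapes (a label on Databricks, a numeric score on Snowflake), downstream steps may want a common representation. The following is a minimal Python sketch of bucketing a Snowflake score into a coarse label. The function name `sentiment_label` and the `neutral_band` threshold of 0.25 are hypothetical choices for illustration, not values defined by either platform, and the Databricks value `mixed` has no equivalent in this mapping.

```python
def sentiment_label(score, neutral_band=0.25):
    """Map a sentiment score in [-1, 1] to a coarse label.

    neutral_band is an assumed cut-off chosen for illustration only.
    A missing score (None) is passed through unchanged.
    """
    if score is None:
        return None
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1 and 1")
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"
```

Tune the band to your own data. A wider band classifies more rows as neutral.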
Classify
Classify input text according to labels you provide.
Available in: Databricks, Snowflake
Input
Property name | Configuration |
---|---|
Content | Select the column you want to classify text for. You can only select columns of string type. |
Classification labels | Add the labels to use when classifying the data. You can use from 2 to 20 labels. |
Limit for preview | Set the number of rows to load in data preview. The default value is 10. If you set this to 0 there is no limit. |
Output
Target data platform | Configuration |
---|---|
Databricks | A text string is returned with the classification label matching the input string in Content. |
Snowflake | A text string is returned with the classification label matching the input string in Content. |
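If you prepare the classification labels outside the processor, for example from a configuration file, it can help to check the 2 to 20 label limit before entering them. The sketch below is one way to do that in Python. The helper name `validate_labels` is hypothetical, and trimming whitespace and dropping duplicates are assumptions made for this example rather than documented processor behavior.

```python
def validate_labels(labels):
    """Return cleaned labels, or raise if the count is outside 2 to 20.

    Whitespace trimming and duplicate removal are assumptions of this sketch.
    """
    cleaned = []
    for label in labels:
        text = label.strip()
        # Skip empty entries and labels already seen.
        if text and text not in cleaned:
            cleaned.append(text)
    if not 2 <= len(cleaned) <= 20:
        raise ValueError(f"expected 2 to 20 labels, got {len(cleaned)}")
    return cleaned
```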
Fix grammar
Correct grammatical errors in a text column.
Available in: Databricks
Input
Property name | Configuration |
---|---|
Content | Select the column you want to fix grammar in. You can only select columns of string type. |
Output column name | If you select Create a new column, you can enter a name for the generated output column. Example of the expected format: ASDasd123_4564 |
Limit for preview | Set the number of rows to load in data preview. The default value is 10. If you set this to 0 there is no limit. |
Output
Target data platform | Configuration |
---|---|
Databricks | A text string is returned with the grammar corrected. |
Mask
Mask specified entities in a text column. Masked entities are replaced with [MASKED].
Available in: Databricks
Input
Property name | Configuration |
---|---|
Content | Select the column you want to mask text entities in. You can only select columns of string type. |
Mask labels | Add a label for each text entity that you want to mask. |
Output column name | If you select Create a new column, you can enter a name for the generated output column. Example of the expected format: ASDasd123_4564 |
Limit for preview | Set the number of rows to load in data preview. The default value is 10. If you set this to 0 there is no limit. |
Output
Target data platform | Configuration |
---|---|
Databricks | A text string is returned with the specified entities masked. |
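To relate the Content and Mask labels properties to the underlying platform function, the following Python sketch builds a Databricks SQL expression that calls `ai_mask` with a column and an array of labels. This page does not state the SQL that the processor actually generates, so treat the mapping as an assumption. The helper name `mask_expression` is hypothetical, and the column is assumed to be a plain identifier that needs no quoting.

```python
def mask_expression(column, labels):
    """Build an ai_mask(...) SQL expression as a string.

    Assumes `column` is a simple identifier. Labels containing quotes or
    backslashes are rejected instead of escaped, to keep the sketch small.
    """
    if not labels:
        raise ValueError("at least one mask label is required")
    for label in labels:
        if "'" in label or "\\" in label:
            raise ValueError(f"unsupported character in label: {label!r}")
    label_list = ", ".join(f"'{label}'" for label in labels)
    return f"ai_mask({column}, array({label_list}))"
```

For example, `mask_expression("comment", ["person", "email"])` produces `ai_mask(comment, array('person', 'email'))`.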
Similarity
Compare two strings and compute the semantic similarity score.
Available in: Databricks
Input
Property name | Configuration |
---|---|
Content | Select the column you want to compare. You can only select columns of string type. |
With | You can compare the text in Content with text from another string column or a value that you specify. |
Output column name | Enter a name for the generated output column. Example of the expected format: ASDasd123_4564 |
Limit for preview | Set the number of rows to load in data preview. The default value is 10. If you set this to 0 there is no limit. |
Output
Target data platform | Configuration |
---|---|
Databricks | The score is returned as a float value between 0 and 1.0, where 1.0 means that the strings are equal. |
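A common use of the similarity score is to flag pairs of rows as likely duplicates. The following Python sketch filters already-scored pairs by a cut-off. The function name `near_duplicates` and the threshold of 0.9 are hypothetical, and the sample scores in the usage note are invented for illustration, not real model output.

```python
def near_duplicates(scored_pairs, threshold=0.9):
    """Return the (left, right) pairs whose score meets the threshold.

    scored_pairs is an iterable of (left, right, score) tuples, where score
    is a float between 0 and 1.0 or None. The threshold is an assumed value.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1.0")
    return [
        (left, right)
        for left, right, score in scored_pairs
        if score is not None and score >= threshold
    ]
```

For example, given `[("Acme Inc", "ACME Inc.", 0.95), ("Acme Inc", "Globex", 0.12)]`, only the first pair is kept at the default threshold.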
Summarize
Generate a summary of the text in a text column.
Available in: Databricks, Snowflake
Input
Property name | Configuration |
---|---|
Content | Select the column you want to summarize. You can only select columns of string type. |
Max word count | Set the maximum word count of the text summary. You can only set integer values. The default value is 50. If you leave it empty or set it to zero, the maximum word count is not applied. Note: This option is only available in Databricks. |
Output column name | Enter a name for the generated output column. Example of the expected format: ASDasd123_4564 |
Limit for preview | Set the number of rows to load in data preview. The default value is 10. If you set this to 0 there is no limit. |
Output
Target data platform | Configuration |
---|---|
Databricks | A text string is returned with a summary of the input string in Content. |
Snowflake | A text string is returned with a summary of the input string in Content. |
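The Max word count rules (default 50, empty or zero meaning no limit, Databricks only) can be sketched as a small Python function that builds the corresponding platform call. This assumes the processor maps to `ai_summarize` on Databricks and `SNOWFLAKE.CORTEX.SUMMARIZE` on Snowflake, which this page does not confirm. The function name `summarize_expression` is hypothetical, and the column is assumed to be a plain identifier.

```python
def summarize_expression(platform, column, max_words=50):
    """Build a summarize SQL expression as a string for the given platform.

    max_words=None stands for an empty Max word count field. On Databricks,
    empty and 0 both translate to 0, which means no word limit.
    On Snowflake, max_words is ignored because the option is not available.
    """
    if platform == "snowflake":
        return f"SNOWFLAKE.CORTEX.SUMMARIZE({column})"
    if platform == "databricks":
        if max_words is None:
            max_words = 0
        if not isinstance(max_words, int) or max_words < 0:
            raise ValueError("max_words must be a non-negative integer")
        return f"ai_summarize({column}, {max_words})"
    raise ValueError(f"unsupported platform: {platform}")
```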
Translate
Translate the text content of a column.
Available in: Databricks, Snowflake
For information about supported languages, see reference documentation for the data platform.
Input
Property name | Configuration |
---|---|
Content | Select the column you want to translate. You can only select columns of string type. |
Translate from | Select the language to translate from. You can also select to have the language auto-detected. Available in: Snowflake |
Translate to | Select the language to translate to. |
Output column name | Enter a name for the generated output column. Example of the expected format: ASDasd123_4564 |
Limit for preview | Set the number of rows to load in data preview. The default value is 10. If you set this to 0 there is no limit. |
Output
Target data platform | Configuration |
---|---|
Databricks | A text string is returned with a translation of the input string in Content. |
Snowflake | A text string is returned with a translation of the input string in Content. |
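The difference between the platforms (a source language or auto-detection on Snowflake, target language only on Databricks) can be sketched as follows in Python. This assumes the processor maps to `SNOWFLAKE.CORTEX.TRANSLATE`, where an empty source language string requests auto-detection, and to `ai_translate` on Databricks. This page does not confirm the generated SQL. The function name `translate_expression` is hypothetical, and the column is assumed to be a plain identifier.

```python
def translate_expression(platform, column, target, source=None):
    """Build a translate SQL expression as a string for the given platform.

    source=None stands for auto-detection and is only used on Snowflake.
    Language codes are restricted to letters and hyphens in this sketch.
    """
    for code in (target, source or ""):
        if not all(ch.isalpha() or ch == "-" for ch in code):
            raise ValueError(f"unexpected language code: {code!r}")
    if not target:
        raise ValueError("a target language is required")
    if platform == "snowflake":
        # An empty source string asks Snowflake to detect the language.
        return f"SNOWFLAKE.CORTEX.TRANSLATE({column}, '{source or ''}', '{target}')"
    if platform == "databricks":
        return f"ai_translate({column}, '{target}')"
    raise ValueError(f"unsupported platform: {platform}")
```

Check the platform reference documentation for the language codes each function accepts.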
Editing the processor
To rename the processor, click the Edit icon that is displayed when hovering over the default name of the processor.
To edit its description, click the Edit icon that is displayed when hovering over Description.