Bicep - Deploying an Azure Data Factory
Happy New Year, everyone! To start the year, I’ll show you how I use Bicep to deploy and configure Azure Data Factory, walking you through the essential settings to ensure a secure and efficient setup. We’ll explore configurations specific to Data Factory and general settings that you can apply when deploying other Azure resources using Bicep.
Here’s what we’ll explore:
- Name and location: Specify what the Data Factory will be called and the Azure region where it will be deployed.
- Identity: Lets the Data Factory authenticate to other services using a system-assigned or user-assigned managed identity.
- Encryption: Protects data with a customer-managed key by linking the factory to Azure Key Vault for secure key management.
- Global parameters: Constants that every pipeline in the Data Factory can reference (see the expression example right after this list).
- Public network access: Decides whether the Data Factory can be reached over the public internet.
- Integration: Links an existing Azure Purview account for better data governance.
- Source control: Links an Azure DevOps repository for version control and collaboration.
- Tags: As with all resources, tags help us organize and identify them.
- Monitoring: Enables logs and metrics for the deployed resource so we can monitor its activity.
- Lock: Deploys a management lock on the Data Factory to prevent accidental deletion.
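To see why global parameters are worth configuring at deployment time: once the factory is up, any pipeline expression can reference them. For example, using the restServiceUrl parameter defined later in this post:
@pipeline().globalParameters.restServiceUrl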
By the end of this guide, you’ll have a fully configured Azure Data Factory ready to power your data transformation and integration pipelines.
Let’s get started and see how this deployment can be done.
Prerequisites #
Before you start, you’ll need the following to deploy and manage resources with Bicep:
- You need Azure CLI version 2.20.0 or later to deploy Bicep files on your local machine.
- A text editor or IDE of your choice (Visual Studio Code with Bicep extension is my recommendation)
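If you want to verify your local tooling first, the Azure CLI can report its version and manage the bundled Bicep CLI directly:
# Check the Azure CLI version (2.20.0 or later is required)
az --version
# Install the Bicep CLI if it's missing, or bring it up to date
az bicep install
az bicep upgrade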
Create the Bicep files #
The first step in deploying a Bicep template is to create the Bicep file that defines your resources. Create a new file named adf.bicep. This file will contain the code needed to define and configure the deployment of your resources.
@description('The name of the Azure Data Factory.')
param factoryName string

@description('The location where the Azure Data Factory will be deployed.')
param location string = resourceGroup().location

@description('The type of identity to be used for the Azure Data Factory.')
@allowed([
  'SystemAssigned'
  'UserAssigned'
  'None'
])
param identityType string = 'SystemAssigned'

@description('The user assigned identities for the Azure Data Factory.')
param userAssignedIdentities object = {}

@description('The encryption settings for the Azure Data Factory.')
param encryption object = {
  identity: {
    userAssignedIdentity: ''
  }
  keyName: ''
  keyVersion: ''
  vaultBaseUrl: ''
}

@description('The global parameters for the Azure Data Factory.')
param globalParameters object = {}

@description('The public network access setting for the Azure Data Factory.')
@allowed([
  'Enabled'
  'Disabled'
])
param publicNetworkAccess string = 'Enabled'

@description('The Purview configuration for the Azure Data Factory.')
param purviewConfiguration object = {
  purviewResourceId: ''
}

@description('The repository configuration for the Azure Data Factory.')
param repoConfiguration object = {
  accountName: 'AccountName'
  collaborationBranch: 'main'
  disablePublish: false
  lastCommitId: ''
  repositoryName: 'RepositoryName'
  rootFolder: '/'
  type: 'FactoryVSTSConfiguration'
  projectName: 'ProjectName'
}

@description('The tags to be associated with the Azure Data Factory.')
param tags object = {
  environment: 'dev'
  project: 'ProjectName'
}

@description('The resource ID of the Log Analytics workspace.')
param logAnalyticsWorkspaceId string

resource dataFactory 'Microsoft.DataFactory/factories@2018-06-01' = {
  name: factoryName
  location: location
  identity: {
    type: identityType
    // User-assigned identities are only passed when that identity type is selected
    userAssignedIdentities: identityType == 'UserAssigned' ? userAssignedIdentities : null
  }
  properties: {
    // Apply customer-managed key encryption only when a key name is provided,
    // so the template still deploys with the empty default
    encryption: empty(encryption.keyName) ? null : encryption
    globalParameters: globalParameters
    publicNetworkAccess: publicNetworkAccess
    // Link Purview only when a resource ID is provided
    purviewConfiguration: empty(purviewConfiguration.purviewResourceId) ? null : purviewConfiguration
    repoConfiguration: repoConfiguration
  }
  tags: tags
}
var diagnosticLogCategories = [
  'PipelineRuns'
  'ActivityRuns'
  'TriggerRuns'
  'SandboxPipelineRuns'
  'SandboxActivityRuns'
  'SSISPackageEventMessages'
  'SSISPackageExecutableStatistics'
  'SSISPackageEventMessageContext'
  'SSISPackageExecutionComponentPhases'
  'SSISPackageExecutionDataStatistics'
  'SSISIntegrationRuntimeLogs'
]

resource diagnosticSettings 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
  name: 'adf-diagnostic-settings'
  scope: dataFactory
  properties: {
    workspaceId: logAnalyticsWorkspaceId
    // One entry per Data Factory log category, all routed to the Log Analytics workspace
    logs: [for category in diagnosticLogCategories: {
      category: category
      enabled: true
      retentionPolicy: {
        enabled: false
        days: 0
      }
    }]
    metrics: [
      {
        category: 'AllMetrics'
        enabled: true
        retentionPolicy: {
          enabled: false
          days: 0
        }
      }
    ]
  }
}
resource dataFactoryDeleteLock 'Microsoft.Authorization/locks@2016-09-01' = {
  name: 'dataFactoryDeleteLock'
  scope: dataFactory
  properties: {
    level: 'CanNotDelete'
    notes: 'Lock to prevent accidental deletion of the Data Factory'
  }
}
output dataFactoryId string = dataFactory.id
output dataFactoryName string = dataFactory.name
output dataFactoryLocation string = dataFactory.location
output dataFactoryIdentityType string = dataFactory.identity.type
output dataFactoryPublicNetworkAccess string = dataFactory.properties.publicNetworkAccess
Breaking Down the Deployment Components #
- Parameters: Defines the essential input values needed to customize the deployment, including the factory name, location, identity type, encryption settings, and network access options. These parameters enhance flexibility and reusability across different environments.
- dataFactory: The core of the deployment, responsible for setting up the Azure Data Factory resource with key configurations for functionality, security, and connectivity.
- diagnosticSettings: Enables logging and monitoring by integrating with Azure Log Analytics, capturing pipeline executions, activity runs, and other operational metrics for centralized analysis.
- dataFactoryDeleteLock: Adds a protective lock to prevent accidental deletion of the Data Factory resource, ensuring operational continuity and avoiding unintended disruptions.
- outputs: Provides useful deployment details such as the Data Factory's id, name, location, identity type, and network access status, making it easier to validate and integrate the resource post-deployment (see the command right after this list for reading them back).
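Once a deployment finishes, you can read these outputs back with a standard Azure CLI query; the deployment name is whatever you pass to az deployment group create:
az deployment group show \
  --resource-group <resource-group-name> \
  --name <deployment-name> \
  --query properties.outputs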
Deployment scope #
You can target your deployment at a resource group, subscription, management group, or tenant. In this case, since we're creating an Azure Data Factory, a resource group is the natural place for all the related resources. By default, a Bicep template is deployed at the resource group scope.
You can use an existing Resource Group, or you can create a new one with the Azure CLI, as shown below.
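A minimal example, with placeholder values to adjust:
az group create \
  --name <resource-group-name> \
  --location westeurope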
Deploy the Bicep template using the Azure CLI #
Once your Bicep template is prepared, and you’ve selected your desired scope, you can proceed to deploy the template through the Azure CLI. To do so, execute the following commands.
Parameters #
Personalization is key to making your template reusable. With the parameters, you can easily tailor the template to your specific needs. You can use either inline parameters or a parameter file to pass parameter values. In my case, I will use a file to pass the parameters; here is an example.
using './adf.bicep'

param factoryName = 'ADF-DEMO-WE'
param location = 'westeurope'
param identityType = 'UserAssigned'
param userAssignedIdentities = {
  '/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<managed-identity-name>': {}
}
param encryption = {
  identity: {
    userAssignedIdentity: '/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<managed-identity-name>'
  }
  keyName: '<key-name>'
  keyVersion: '0000000000000000000000000000000'
  vaultBaseUrl: '<key-vault-uri>'
}
param globalParameters = {
  restServiceUrl: {
    type: 'string'
    value: 'https://api.ProjectName.com'
  }
  maxRetryAttempts: {
    type: 'int'
    value: 5
  }
  enableLogging: {
    type: 'bool'
    value: true
  }
  allowedIPs: {
    type: 'array'
    value: [
      '192.168.1.1'
      '192.168.1.2'
    ]
  }
}
param publicNetworkAccess = 'Enabled'
param purviewConfiguration = {
  purviewResourceId: '/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.Purview/accounts/<purview-account-name>'
}
param repoConfiguration = {
  accountName: '<azure-devops-account-name>'
  collaborationBranch: 'main'
  disablePublish: false
  lastCommitId: '0000000000000000000000000000000'
  repositoryName: '<repo-name>'
  rootFolder: '/'
  type: 'FactoryVSTSConfiguration'
  projectName: '<project-name>'
}
param tags = {
  bicep: 'true'
  project: 'jorgebernhardt.com'
}
param logAnalyticsWorkspaceId = '/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.OperationalInsights/workspaces/<log-analytics-workspace-name>'
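If you prefer inline parameters for a quick test instead of a file, the same values can be passed directly on the command line; note that without a .bicepparam file the template has to be referenced explicitly:
az deployment group create \
  --resource-group <resource-group-name> \
  --template-file adf.bicep \
  --parameters factoryName=ADF-DEMO-WE location=westeurope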
Important: Please note that the parameter file stores parameter values in plain text format. If you need to include a parameter with sensitive data, it’s recommended to store the value in a secure key vault.
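For example, a .bicepparam file can pull a value straight from Key Vault with the getSecret function, as long as the receiving parameter is declared with @secure() in the template. The adfSecret parameter here is a hypothetical illustration; the adf.bicep shown in this post doesn't declare one:
using './adf.bicep'
// Hypothetical @secure() parameter; every identifier below is a placeholder
param adfSecret = getSecret('<subscription-id>', '<resource-group-name>', '<key-vault-name>', '<secret-name>')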
Preview changes #
Before deploying a Bicep file, you can preview the changes it will make to your resources. The what-if operation doesn't modify anything; it simply prints color-coded output showing which resources would be created, changed, or deleted.
az deployment group what-if \
--resource-group <resource-group-name> \
--name <deployment-name> \
--parameters <filename>.bicepparam
Deploy the Azure resource #
Finally, to deploy the template, run the following command.
az deployment group create \
--resource-group <resource-group-name> \
--name <deployment-name> \
--parameters <filename>.bicepparam
Validate the deployment #
To check that your Azure Data Factory resource is set up correctly, you can use the Azure Portal or the Azure CLI. For the Azure CLI, run the following command to show the details of your Data Factory (the az datafactory commands come from the datafactory extension, which you can add with az extension add --name datafactory):
az datafactory show \
--name <data-factory-name> \
--resource-group <resource-group-name>
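If you only need a quick health signal, you can narrow the output to the provisioning state; this assumes provisioningState appears in the command's output, as it does for most ARM resources:
az datafactory show \
  --name <data-factory-name> \
  --resource-group <resource-group-name> \
  --query provisioningState \
  --output tsv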
Thank you for taking the time to read my post. I sincerely hope that you find it helpful.