Let us first look at Azure before moving on to our Azure Blob and Data Lake Storage plugin and how it benefits Workload Automation users.
“Azure is an open and flexible cloud platform that enables you to quickly build, deploy and manage applications across a global network of Microsoft-managed data centres. You can build applications using any language, tool, or framework. You can integrate your public cloud applications with your existing IT environment.”
Azure is flexible: it lets you use multiple languages, frameworks, and tools to create the customized applications that you need. As a platform, it also lets you scale applications up by adding servers and storage as demand grows.
What is an Azure Storage Account?
The Azure Storage platform is Microsoft’s cloud storage solution for modern data storage scenarios. Core storage services offer a massively scalable object store for data objects, disk storage for Azure virtual machines (VMs), a file system service for the cloud, a messaging store for reliable messaging, and a NoSQL store.
An Azure storage account contains all your Azure Storage data objects: blobs, files, queues, tables, and disks. The storage account provides a unique namespace for your Azure Storage data that is accessible from anywhere in the world over HTTP or HTTPS. Data in your Azure storage account is durable and highly available, secure, and massively scalable.
Core storage services
The Azure Storage platform includes the following data services:
- Azure Blobs: A massively scalable object store for text and binary data. Also includes support for big data analytics through Data Lake Storage Gen2.
- Azure Files: Managed file shares for cloud or on-premises deployments.
- Azure Queues: A messaging store for reliable messaging between application components.
- Azure Tables: A NoSQL store for schemaless storage of structured data.
- Azure Disks: Block-level storage volumes for Azure VMs.
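Blobs, Data Lake Storage Gen2, files, queues, and tables are each reached through their own endpoint under the storage account's namespace. As a rough illustration, the sketch below builds the default public-cloud endpoint URL for a given account and service. The function name, the service keys, and the sample account name `mystorageacct` are hypothetical; accounts that use custom domains or sovereign clouds have different hostnames.

```python
import re

# Default public-cloud subdomains for the endpoint-based storage services.
SERVICE_SUBDOMAINS = {
    "blob": "blob",
    "datalake": "dfs",   # Data Lake Storage Gen2 endpoint
    "file": "file",
    "queue": "queue",
    "table": "table",
}

# Storage account names are 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def service_endpoint(account_name: str, service: str, use_https: bool = True) -> str:
    """Return the default endpoint URL for one service of a storage account."""
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(f"invalid storage account name: {account_name!r}")
    if service not in SERVICE_SUBDOMAINS:
        raise ValueError(f"unknown service: {service!r}")
    scheme = "https" if use_https else "http"
    return f"{scheme}://{account_name}.{SERVICE_SUBDOMAINS[service]}.core.windows.net"


if __name__ == "__main__":
    # "mystorageacct" is a made-up account name used only for illustration.
    for name in SERVICE_SUBDOMAINS:
        print(name, service_endpoint("mystorageacct", name))
```

Because the account name is part of every hostname, it has to be unique across Azure, which is why the naming rules are strict.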
Introduction to Azure Data Lake Storage Gen2
Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage.
Data Lake Storage Gen2 converges the capabilities of Azure Data Lake Storage Gen1 with Azure Blob storage. For example, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. Because these capabilities are built on Blob storage, you also get low-cost, tiered storage with high availability and disaster recovery capabilities.
Figure 1 Azure Data Lake Storage Gen2
The following example illustrates the benefits:
Cloud computing has enabled many teams to adopt agile development methods. They need to repeatedly deploy their solutions to the cloud, and know their infrastructure is in a reliable state. As infrastructure has become part of the iterative process, the division between operations and development has disappeared. Teams need to manage infrastructure and application code through a unified process.
To meet these challenges, you can automate the upload and download of multiple files and adopt the practice of infrastructure as code.
Using Azure SPN (Service Principal Name) credentials or an access key, a user can log in and select any available container in the Azure storage account.
Instead of using the Azure portal, you can upload or download files by using the Azure Storage plugin with Workload Automation. After logging in with Azure SPN credentials or an access key, the user can see all the available files on the server (Azure Storage – Data Lake Gen2).
Let us now turn to the plugin and its job definition parameters.
Azure Blob and Data Lake Storage Plugin
The Azure Blob and Data Lake Storage plugin is available on Automation Hub. Download it to empower your Workload Automation environment.
Log in to the Dynamic Workload Console and open the Workload Designer. Choose to create a new job and select the “Azure Blob and Data Lake Storage Plugin” job type in the Cloud section.
Figure 2 Job Definition
Connection Tab
Establishing connection to the Azure server:
Connection Info
Use this section to connect to the Azure server.
Subscription – The ID that uniquely identifies your subscription to Azure. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.
Client – The Azure Client ID associated with your SPN account. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.
Tenant – The Azure Tenant ID associated with your SPN account. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.
Password (Key) – The Azure Client Secret Key, also known as the client key, associated with your SPN account. This attribute is required. If not specified in the job definition, it must be supplied in the plug-in properties file.
Account Name – The account name associated with your Azure Data Storage account.
Test Connection – Click to verify that the connection to the Azure server works correctly.
Figure 3 Connection Tab – SPN
OR
Access Key Authentication
Account Name – The account name associated with your Azure Data Storage account.
Access Key – The storage account access key used to authorize access to data in your storage account.
Figure 4 Connection Tab – Access Key
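The two authentication options and the "job definition first, properties file second" lookup described above can be sketched as follows. This is only an illustration of the lookup order, not the plugin's actual implementation: the key names (`subscription`, `client`, `tenant`, `password`, `account_name`, `access_key`) are hypothetical, and applying the properties-file fallback to the account name and access key, as well as preferring the access key when one is present, are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional

# Attributes the SPN connection marks as required (hypothetical key names).
SPN_FIELDS = ("subscription", "client", "tenant", "password")


@dataclass(frozen=True)
class ConnectionSettings:
    auth_mode: str            # "spn" or "access_key"
    account_name: str
    values: Dict[str, str]    # resolved credentials for the chosen mode


def resolve_connection(job_definition: Mapping[str, str],
                       properties_file: Mapping[str, str]) -> ConnectionSettings:
    """Resolve connection attributes, preferring the job definition."""

    def lookup(name: str) -> Optional[str]:
        # A value in the job definition wins; otherwise fall back to the
        # plug-in properties file.
        return job_definition.get(name) or properties_file.get(name)

    account = lookup("account_name")
    if not account:
        raise ValueError("account_name is required")

    # Assumption for this sketch: an access key, when supplied, selects
    # access key authentication.
    access_key = lookup("access_key")
    if access_key:
        return ConnectionSettings("access_key", account, {"access_key": access_key})

    missing = [field for field in SPN_FIELDS if not lookup(field)]
    if missing:
        raise ValueError("missing required SPN attributes: " + ", ".join(missing))
    return ConnectionSettings("spn", account, {f: lookup(f) for f in SPN_FIELDS})


if __name__ == "__main__":
    # Placeholder values only; real secrets should never be hard-coded.
    job = {"account_name": "mystorageacct", "client": "client-id", "tenant": "tenant-id"}
    props = {"subscription": "subscription-id", "password": "client-secret"}
    print(resolve_connection(job, props).auth_mode)
```

Failing early with the list of missing attributes mirrors what the Test Connection button is for: catching an incomplete connection before the job is submitted.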
Action Tab
Use this section to define the operation details.
Operation
Container Name – Specify the name of the container in which the files are stored. Click the Select button to choose the container name defined in the cloud console. Select an item from the list; the selected item is displayed in the Container Name field.
Figure 5 Action Tab – Select Container
Select Operations
Use this section to either upload or download objects.
Figure 6 Action Tab – Upload
Upload File – Click this radio button to upload files to the Storage Account.
Folder Location Inside Container – Enter the name of the file to be uploaded or the path where the file is stored. Click the Search button to choose the file name defined in the cloud console. Select one or more items from the list; the selected items are displayed in the Folder Location Inside Container field.
Source File Paths – Displays the path of the source file. You can use the filter option to streamline your search.
If a file already exists – Select the action the application performs if the file being uploaded already exists in the console.
· Replace – Selecting this option replaces the already existing file in the console.
· Skip – Selecting this option skips the upload of the selected file in the console.
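The Replace and Skip options amount to a per-file decision made before each upload. The sketch below shows one way that decision could be expressed; it is not the plugin's code. The function name `plan_uploads`, the action labels, and the assumption that each file keeps its base name under the chosen folder are all hypothetical, and the set of existing names stands in for a real container listing.

```python
import posixpath
from typing import Iterable, List, Set, Tuple


def plan_uploads(source_paths: Iterable[str],
                 existing_names: Set[str],
                 folder: str,
                 if_exists: str = "replace") -> List[Tuple[str, str, str]]:
    """Decide, per source file, whether to upload, replace, or skip it.

    Returns (source path, target name in the container, action) tuples.
    Source paths are assumed to be POSIX-style for this illustration.
    """
    if if_exists not in ("replace", "skip"):
        raise ValueError("if_exists must be 'replace' or 'skip'")

    plan = []
    for source in source_paths:
        # Assumption: the file keeps its base name under the chosen folder.
        target = posixpath.join(folder.strip("/"), posixpath.basename(source))
        if target not in existing_names:
            action = "upload"        # nothing to conflict with
        elif if_exists == "skip":
            action = "skip"          # leave the existing file untouched
        else:
            action = "replace"       # overwrite the existing file
        plan.append((source, target, action))
    return plan


if __name__ == "__main__":
    already_there = {"reports/a.csv"}
    for row in plan_uploads(["/data/a.csv", "/data/b.csv"], already_there, "reports", "skip"):
        print(row)
```

Computing the plan separately from the transfer makes it easy to log which files were skipped, which is useful when the job runs unattended inside a job stream.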
Download File – Click this radio button to download files from the Storage Account.
Figure 7 Action Tab – Download
Select Files– Click the Select Files button to choose the file name defined in the cloud.
Destination File Path – Provide the location to which the files are downloaded. Click the Select button to choose the location; the selected item is displayed in the Destination File Path field.
Submitting your job
It is time to submit your job into the current plan. You can add your job to the job stream that automates your business process flow. Select the action menu in the top-left corner of the job definition panel and click Submit Job into Current Plan. A confirmation message is displayed, and you can switch to the Monitoring view to follow the job's progress.
Figure 8 Submit Job
Are you curious to try out the Azure Data Lake Storage plugin? Download the integrations from the Automation Hub and get started or drop a line at ernesto.carrabba@hcl.com.
Authors' Bio
Shubham Chaurasia – Developer at HCLSoftware
Responsible for developing integration plug-ins for Workload Automation. Hands-on with different programming languages and frameworks like JAVA, JPA, Microservices, MySQL, Oracle RDBMS, AngularJS.
LinkedIn – https://www.linkedin.com/in/shubham-chaurasia-1a78b8a9/
Rabic Meeran K, Technical Specialist at HCL Technologies
Responsible for developing integration plug-ins for Workload Automation. Hands-on with different programming languages and frameworks like JAVA, JPA, Spring Boot, Microservices, MySQL, Oracle RDBMS, Ruby on Rails, Jenkins, Docker, AWS, C and C++.
LinkedIn – https://www.linkedin.com/in/rabic-meeran-4a828324/
Saket Saurav, Tester (Senior Engineer) at HCL Technologies
Responsible for performing automation and manual testing for different plugins in Workload Automation using the Java Unified Test Automation Framework. Hands-on experience with the Java programming language and with web services backed by databases like Oracle and SQL Server.
LinkedIn – https://www.linkedin.com/in/saket-saurav-8892b546/