3 things you can do to get started with Azure Batch for Big Compute
- Posted on April 9, 2019
My coffee machine and coffee cup are both connected to the internet, as are my home beer brewing appliance and many other devices that I use throughout the day. Having these devices internet-connected allows me to plan and navigate my activities more efficiently. From the time we wake until we go to bed, and even while sleeping, internet-connected devices and applications are collecting data about us. We are talking about massive amounts of data collected every day, and with all of this data comes the question of how to analyze it. Azure Batch is the answer.
How much data are we talking about?
Have you ever thought about how much data is being gathered and analyzed by the companies who make those products? Every minute of every day Instagram users upload 49,380 photos, Skype makes 176,220 calls, Netflix streams 97,222 hours of video and there are 3.8 million Google searches performed (view the “Data Never Sleeps” infographic by Domo for yourself). Companies are looking to better understand their customer needs, establish a competitive edge, and develop new innovative products. Meeting those needs requires applications designed to process big data along with faster compute to output information in a timely manner. This is where Big Compute enters the game using Azure Batch. Batch is an Azure tool used to create and manage compute nodes to support large-scale parallel high-performance batch jobs.
So what is Big Compute?
Simply put, it’s the ability to have multiple compute cores deployed in a cluster/grid that work together and can auto-scale the number of cores as workloads increase or decrease. You will find Big Compute called high performance computing (HPC), parallel/grid computing, or batch computing, and you will find all of these terms used interchangeably.
What does this look like in practice?
Let’s say I have 1,000 video files that I need to compress into smaller files before they can be posted to an internet site for public use. I can write a compression script for 1 VM with 4 CPU cores to process 1 file per core (4 files at a time), and if each file takes 5 minutes to process, the job would take approximately 21 hours to complete (250 rounds x 5 minutes = 1,250 minutes). This is where the fun and magic happen when you use Azure Batch and Big Compute. If I need the job to complete in 15 minutes, with Azure Batch I could deploy 334 CPU cores in a grid and distribute the files across all cores. The estimate is 1,000 files x 5 minutes = 5,000 core-minutes, and 5,000 / 15 minutes = 333.3 cores, rounded up to 334 because each core handles whole files (333 cores would finish 999 files in 15 minutes and leave one over).
Want more fun? If I need it to complete in about 5 minutes, I would deploy 1,000 cores, one per file. Going faster than that would mean splitting individual files across cores, because each file takes 5 minutes on a single core. There is a bit more complexity that goes into determining what processing performance and memory are required, or whether the workload is CPU or GPU dependent, but the point is that with Azure Batch we can build grid clusters to meet the compute needs. There are VMs specifically designed for compute-intensive or graphics-intensive workloads, such as H-Series (CPU) and N-Series (GPU), but you can also use other, less expensive VMs with Batch.
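The sizing arithmetic in the video compression example can be sketched in a few lines of Python. This is only a rough model, not anything Azure Batch provides: the function names `wall_clock_minutes` and `cores_needed` are made up for illustration, and the sketch assumes every file takes the same time, one core handles one whole file at a time, and startup and data transfer cost nothing. Because files are not split, it rounds up, so it can report one more core than a simple division of total minutes by the deadline.

```python
import math


def wall_clock_minutes(files: int, cores: int, minutes_per_file: float) -> float:
    """Elapsed time when each core processes one whole file at a time."""
    rounds = math.ceil(files / cores)  # number of passes over the core pool
    return rounds * minutes_per_file


def cores_needed(files: int, minutes_per_file: float, deadline_minutes: float) -> int:
    """Smallest core count that meets the deadline (files are not split)."""
    rounds = math.floor(deadline_minutes / minutes_per_file)  # files per core
    if rounds < 1:
        raise ValueError("deadline is shorter than the time to process one file")
    return math.ceil(files / rounds)


if __name__ == "__main__":
    # Single VM with 4 cores, 1,000 files, 5 minutes per file.
    print(wall_clock_minutes(1000, 4, 5) / 60, "hours on one 4-core VM")
    # Cores required to finish inside a 15-minute window.
    print(cores_needed(1000, 5, 15), "cores for a 15-minute deadline")
    # One core per file: the job takes as long as a single file.
    print(wall_clock_minutes(1000, 1000, 5), "minutes with 1,000 cores")
```

In a real deployment the per-file time would vary with file size and VM type, so an estimate like this is a starting point for choosing a pool size rather than a guarantee.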
How and where is Big Compute being used?
While there are a variety of use cases for Big Compute, two that stand out are analytics and HPC workloads. Let’s take a look at these two:
1) Analytics – This goes back to the introduction where all that data is being collected by apps and IoT devices and needs to be processed and analyzed. It’s this space where Big Compute will become the next generation of application and compute to help companies better understand their customer needs, establish a competitive edge, and develop new innovative products.
2) HPC workloads – HPC platforms are commonly used in industries such as: Automotive (autonomous driving), Oil and Gas (discovery and mapping ocean floors), Financial Services (Monte Carlo risk simulations), Manufacturing (simulations for predictive performance) and Insurance (risk models).
How can you get started with Azure Batch for Big Compute?
Big Compute is relatively new to most people, but as you get to understand the power of being able to process massive amounts of data collected across all those devices, you will come to understand Big Compute is becoming the next generation of computing. You will also recognize that to be competitive, companies will need to scale and access more and faster compute power on demand, which can only be achieved by leveraging cloud services.
Here are 3 things you can do to get started with Azure Batch for Big Compute:
- Get to know how companies can leverage Big Compute on Azure.
- Become familiar with how applications leverage Azure Batch for Big Compute workloads.
- Get to know Azure Batch services and the surrounding foundational cloud infrastructure designs required to support Big Compute integration.
Azure can help companies make the move from rigid and expensive capital investments to scalable, manageable, and business-driven solutions in the cloud.