Introduction
Cloud computing is an emerging technology composed of several key components that work together to support a seamless network of interconnected devices. These devices, such as sensors, routers, smartphones, and smart appliances, form the foundation of the Internet of Everything (IoE). The huge volumes of data that IoE devices generate are accumulated and processed in the cloud, enabling real-time analysis and insights. As a result, there is a pressing need for effective load-balancing and task-scheduling techniques in cloud computing.
The primary objective of these techniques is to divide the workload evenly across all available resources while addressing related goals such as reducing execution and response times, increasing throughput, and improving fault detection. This systematic literature review (SLR) analyzes the optimization and machine learning (ML) algorithms used for load-balancing and task-scheduling problems in cloud computing environments.
To analyze load-balancing patterns and task-scheduling techniques, we selected a representative set of 63 research articles written in English between 2014 and 2024, using suitable inclusion-exclusion criteria. The SLR minimizes bias and increases objectivity by framing explicit research questions about the topic. We focus on the technologies used, their merits and demerits, gaps in the research, insights into tools, forthcoming opportunities, performance metrics, and an in-depth investigation of ML-based optimization techniques.
Cloud Computing Architecture and Fog Computing
The surge in IoT device usage has led to the emergence of cloud computing as a significant research focus. It offers a variety of services in many different application areas, with the highest level of flexibility and scalability. The high growth of information and communication technologies (ICT) has resulted in integrating big data with the IoT, revolutionizing cloud services.
Within this transformative framework, cloud computing is pivotal in enabling efficient and scalable solutions for managing big data. Numerous cloud service providers enable organizations to obtain the optimal software, storage, and hardware facilities needed to accomplish their goals at a much more affordable cost. Customers subscribe to the services they require under the cloud computing paradigm and sign a service level agreement (SLA) with the cloud vendor, outlining the quality of service (QoS) and conditions of service provision.
Table 1: Service control that the various cloud service models offer to end-users
| Cloud Service Model | Service Control |
|---|---|
| Infrastructure as a Service (IaaS) | Highest level of control over infrastructure, including virtual machines, storage, and networking. |
| Platform as a Service (PaaS) | Control over applications and data, with limited control over the underlying infrastructure. |
| Software as a Service (SaaS) | Least control, as the service provider manages the infrastructure, platform, and software. |
Load balancing is a method that distributes tasks among virtual machines (VMs) through a Virtual Machine Manager (VMM). It assists in handling different types of workloads, such as CPU, network, and memory demands (Buyya 2018; Mishra and Majhi 2020).
The cloud computing infrastructure faces three significant challenges: virtualization, distributed frameworks, and load balancing. The load-balancing problem is defined as the allocation of workloads among processing modules. In a multi-node environment, it is quite probable that certain nodes will experience excessive workload while others remain idle. Load imbalance is harmful to cloud service providers (CSPs): it diminishes the dependability and effectiveness of computing services and puts at risk the QoS guaranteed in the SLA between the customer and the provider (Oduwole et al. 2022).
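To make the mechanism concrete, the following minimal sketch (ours, not drawn from any surveyed paper) shows the greedy baseline that many of the reviewed heuristics refine: each incoming task is routed to the VM with the lowest current utilization. The VM capacities and task lengths are illustrative values.

```python
from dataclasses import dataclass

@dataclass
class VM:
    vm_id: int
    capacity_mips: float        # processing capacity (millions of instructions/s)
    assigned_load: float = 0.0  # total task length assigned so far

    @property
    def utilization(self) -> float:
        return self.assigned_load / self.capacity_mips

def least_loaded_assign(task_lengths: list[float], vms: list[VM]) -> dict[int, list[float]]:
    """Greedy baseline: route each task to the VM with the lowest utilization."""
    placement: dict[int, list[float]] = {vm.vm_id: [] for vm in vms}
    for length in task_lengths:
        target = min(vms, key=lambda vm: vm.utilization)
        target.assigned_load += length
        placement[target.vm_id].append(length)
    return placement

vms = [VM(0, 1000), VM(1, 2000), VM(2, 1500)]
print(least_loaded_assign([400, 250, 900, 120, 600, 300], vms))
```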
Verma et al. (2024) introduced a load-balancing methodology, utilizing genetic algorithms (GA), to improve the quality of the telemedicine industry by efficiently adapting to changing workloads and network conditions at the fog level. The flexibility to adapt can enhance patient care and provide scalability for future healthcare systems.
Walia et al. (2023) cover several emerging technologies in their survey, including Software-Defined Networking (SDN), Blockchain, Digital Twins, Industrial IoT (IIoT), 5G, serverless computing, and quantum computing. These technologies can be incorporated into current fog/edge-of-things models for improved analysis and can provide business intelligence for IoT platforms. Adaptive resource-management strategies are necessary for efficient scheduling and offloading decisions, given the infrastructural characteristics of these computing paradigms.
Intelligent Computing Resource Management (ICRM) is rapidly evolving to meet the increasing needs of businesses and sectors, driven by the proliferation of Internet-based technologies, cloud computing, and cyber-physical systems. With the rise of information-intensive applications, artificial intelligence, cloud computing, and IoT, intelligent computing monitoring and resource allocation have become crucial (Biswas et al. 2024).
Cloud data centers are built to handle hundreds of workloads and therefore typically need optimization; otherwise they suffer low resource utilization and energy waste. The goals of load balancing include reduced job execution times, optimal resource utilization, and high system throughput. Load balancing reduces the overall resource waiting time and avoids resource overload (Apat et al. 2023).
Achieving an equilibrium load distribution between virtual machines (VMs) is an NP-hard problem. Its difficulty stems from two factors: the enormous solution space and the lack of a known polynomial-time algorithm. In a cloud computing environment, a node's load can be characterized as under-loaded, overloaded, or balanced; identifying overloaded and under-loaded nodes and then redistributing load between them is the core of load balancing (Santhanakrishnan and Valarmathi 2022).
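As an illustration of the classification step just described, the sketch below labels nodes by comparing utilization against upper and lower thresholds; the 0.8/0.2 values are our illustrative assumptions, not taken from the cited work.

```python
def classify_nodes(utilizations: dict[str, float],
                   upper: float = 0.8, lower: float = 0.2) -> dict[str, str]:
    """Label each node as overloaded, under-loaded, or balanced.

    Threshold values are illustrative; real systems tune them per workload.
    """
    labels = {}
    for node, u in utilizations.items():
        if u > upper:
            labels[node] = "overloaded"
        elif u < lower:
            labels[node] = "under-loaded"
        else:
            labels[node] = "balanced"
    return labels

print(classify_nodes({"host-1": 0.93, "host-2": 0.12, "host-3": 0.55}))
```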
The emergence of these technologies has also ushered in a sequence of challenges, including storage capacity, processing speed, latency, transmission rates, load balancing, efficient routing, and cost efficiency. Load balancing is a crucial optimization procedure in cloud computing, and achieving it depends on dynamic resource allocation. The main factors that affect load balancing in cloud computing are as follows:
- Workload patterns: Varying workloads, unpredictable traffic patterns, and heterogeneous applications can all reduce the efficiency of the cloud system.
- Geographical distribution: Cloud data centers are generally located in remote areas, which contributes to transmission delays. Fog and edge computing are therefore used to reduce these delays, but the limited resources of fog and edge devices must be managed efficiently.
- Cost and budget constraints: Cost considerations strongly influence load-balancing strategies, which frequently aim to use less expensive resources or to minimize idle assets.
- SLAs and breaches: SLA violations depend on the services offered by cloud service providers. Quality must be maintained without compromising other factors such as throughput, makespan, energy consumption, and cost.
- Virtual machine (VM) migrations: An increase in the number of VM migrations degrades service quality. While migration can be beneficial to some extent, frequent migration increases time overhead: transferring a VM, including copying its memory pages to the destination host, takes considerable time (see the sketch after this list).
- Resource availability: Insufficient resources, such as CPU, memory, or bandwidth, limit load-balancing efficiency. Energy consumption is also a critical factor in data centers, and load balancing reduces it by migrating VMs from overloaded hosts to under-loaded ones.
- Other factors: Fault tolerance, predictive analytics, network latency, and data security also affect load balancing in a cloud system.
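To quantify the migration cost flagged in the VM-migration item above, here is a rough, simplified model (ours) of pre-copy live migration: each round re-copies the pages dirtied during the previous round, until the remainder is small enough for a final stop-and-copy. All parameter values are illustrative.

```python
def precopy_migration_time(mem_gb: float, bw_gbps: float,
                           dirty_rate_gbps: float, stop_copy_gb: float = 0.1,
                           max_rounds: int = 30) -> float:
    """Rough pre-copy live-migration model: total seconds spent copying memory.

    Assumes dirty_rate_gbps < bw_gbps; otherwise pre-copy never converges.
    """
    total_s, to_copy = 0.0, mem_gb
    for _ in range(max_rounds):
        round_s = to_copy / bw_gbps          # time to copy the current dirty set
        total_s += round_s
        to_copy = round_s * dirty_rate_gbps  # pages dirtied in the meantime
        if to_copy <= stop_copy_gb:
            break
    return total_s + to_copy / bw_gbps       # final stop-and-copy round

# e.g. an 8 GB VM over a 1.25 GB/s (~10 Gbps) link with a 0.2 GB/s dirty rate
print(f"{precopy_migration_time(8, 1.25, 0.2):.2f} s")
```

Under these assumed numbers, a single migration takes on the order of 7-8 seconds of copying, which illustrates why frequent migrations quickly erode service quality.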
Load Balancing Techniques in Cloud Computing
We have divided the technologies reviewed in this SLR into five categories: conventional/traditional, heuristic, meta-heuristic, ML-centric, and hybrid.
Traditional approaches to cloud computing resource allocation and load balancing are time-consuming, slow to yield results, and frequently trapped in local optima (Mousavi et al. 2018). In dynamic cloud systems, where resource requirements are only estimated at runtime, static load-balancing algorithms may not succeed. Dynamic load-balancing algorithms, such as Equally Spread Current Execution (ESCE) and the Throttled mechanism, analyze resource requirements and usage at runtime, yet they may incur extra cost and overhead.
Traditional algorithms also struggle to scale with the size and complexity of problems. Several articles examine traditional task-scheduling algorithms, including Min-min, first come-first serve (FCFS), and shortest-job-first (SJF); these are now rarely used because of their slow, time-consuming processing.
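The gap is easy to see on a toy example: the sketch below compares FCFS with SJF on average waiting time for a single queue, assuming job lengths are known in advance (an assumption SJF requires).

```python
def avg_waiting_time(burst_times: list[float]) -> float:
    """Average waiting time when jobs run back-to-back in the given order."""
    waiting, elapsed = 0.0, 0.0
    for burst in burst_times:
        waiting += elapsed
        elapsed += burst
    return waiting / len(burst_times)

jobs = [8.0, 1.0, 4.0, 2.0]
print("FCFS:", avg_waiting_time(jobs))          # arrival order -> 7.5
print("SJF :", avg_waiting_time(sorted(jobs)))  # shortest first -> 2.75
```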
To overcome the issues of conventional methods, heuristic approaches entered the research area. Kumar and Sharma (2018) propose a resource provisioning and de-provisioning algorithm that outperforms FCFS, SJF, and Min-min in terms of makespan and task acceptance ratio. However, it considers task priority only weakly, highlighting a limitation in its task allocation strategy.
Heuristic algorithms demonstrate remarkable scalability. They are highly suitable for large-scale optimization challenges in industries such as manufacturing, banking, and logistics because they efficiently locate approximate solutions even in enormous search spaces (Mishra and Majhi 2020).
Kumar et al. (2018) presented another heuristic method, the ‘Dynamic Load Balancing Algorithm with Elasticity’, showing reduced makespan and an increased task completion ratio. Dubey et al. (2018) introduced a Modified Heterogeneous Earliest Finish Time (HEFT) algorithm that improves server workload distribution to reduce makespan. While promising, both studies lack comprehensive performance evaluations and give limited attention to other Quality of Service (QoS) metrics, such as response time and cost efficiency.
Hung et al. (2019) proposed an Improved Max–min algorithm, achieving the lowest completion times and optimal response times; it outperformed the conventional round-robin (RR), Max–min, and Min-min algorithms.
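Because several of the surveyed papers build on Max–min, a compact sketch of the classic heuristic follows (our rendering of the standard algorithm, not the improved variant above): at each step, pick the unscheduled task whose best-case completion time is largest and assign it to the VM that finishes it earliest.

```python
def max_min_schedule(task_lengths: list[float], vm_speeds: list[float]) -> list[int]:
    """Classic Max-min heuristic. Returns the VM index chosen for each task."""
    ready = [0.0] * len(vm_speeds)              # when each VM becomes free
    placement = [-1] * len(task_lengths)
    unscheduled = set(range(len(task_lengths)))
    while unscheduled:
        best_task, best_vm, best_ct = -1, -1, -1.0
        for t in unscheduled:
            # earliest completion time of task t over all VMs
            ct, vm = min((ready[v] + task_lengths[t] / vm_speeds[v], v)
                         for v in range(len(vm_speeds)))
            if ct > best_ct:                    # "max" over each task's "min"
                best_task, best_vm, best_ct = t, vm, ct
        placement[best_task] = best_vm
        ready[best_vm] = best_ct
        unscheduled.remove(best_task)
    return placement

# Two VMs (the second twice as fast); long tasks are placed first.
print(max_min_schedule([100, 40, 60, 10], [1.0, 2.0]))
```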
Meta-heuristic algorithms were developed to address the shortcomings of heuristic algorithms, which typically produce approximate rather than optimal solutions. Hybrid techniques, combining traditional, heuristic, and machine-learning approaches, have also gained traction in recent years.
Mousavi et al. (2018) propose a hybrid technique combining Teaching Learning-Based Optimization (TLBO) and Grey Wolf Optimization (GWO), achieving maximized throughput without falling into local optima. Similarly, Behera and Sobhanayak (2024) propose a hybrid GWO-GA algorithm, outperforming GWO, GA (Rekha and Dakshayini 2019), and PSO in terms of makespan, cost, and energy consumption.
Further, we have also discussed the cloud and fog architecture and its working principles in the upcoming sections.
Cloud and Fog Computing Architecture
The Industrial Internet of Things (IIoT) has advanced rapidly thanks to progress in artificial intelligence techniques. In Industry 5.0, hyper-automation involves deploying intelligent devices connected to the IIoT, cloud computing, smart robots, agile software, and embedded components. Such systems generate massive amounts of data through hyper-automated communication spanning cloud computing, digital transformation, human-centric sectors, intelligent robots, and industrial production, and managing this big data requires cloud and fog technology (Souri et al. 2024).
Similarly, telemedicine, facilitated by fog computing, has revolutionized the healthcare industry by providing remote access to medical treatment. However, ensuring minimal latency and effective resource utilization is essential for providing high-quality healthcare (Verma et al. 2024).
Big data in the industrial sector is crucial for predictive maintenance, enabling informed decisions and enhancing task allocation in Industry 4.0, thus necessitating a proficient resource management system (Teoh et al. 2023).
The growing demand for load balancing across industries that use cloud and fog services motivated us to evaluate the escalating need for resource-management technologies.
Figure 2: Collaboration between cloud, fog, and IoT layers
Cloud computing builds on virtualization technology, which combines distributed and parallel processing. Using centralized data centers, it moves computation from on-premises to off-premises infrastructure. It has become a leading technology within the swiftly expanding realm of computing paradigms owing to two principles: (1) ‘Dynamic Provisioning’ and (2) ‘Virtualization Technology’ (Tripathy et al. 2023).
Dynamic provisioning is a fundamental concept in the realm of cloud computing. It refers to the automated process of allocating and adjusting computing resources to meet the changing needs of cloud-based applications and services. Virtual network embedding is essential to load balancing in cloud computing as it ensures the mapping of virtual network requests onto physical resources in an effective and balanced manner. By effectively embedding virtual networks onto physical machines, load-balancing algorithms can divide network traffic and workload evenly across the network infrastructure, preventing any single resource from becoming overloaded.
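As a concrete, hypothetical example of dynamic provisioning, the sketch below implements a threshold-based autoscaling rule: add a VM when average utilization rises above an upper bound, release one when it falls below a lower bound. The thresholds, limits, and utilization trace are illustrative assumptions, not values from the literature.

```python
def autoscale(current_vms: int, avg_util: float,
              scale_out_at: float = 0.75, scale_in_at: float = 0.30,
              min_vms: int = 1, max_vms: int = 20) -> int:
    """Threshold-based dynamic provisioning: return the new VM count."""
    if avg_util > scale_out_at and current_vms < max_vms:
        return current_vms + 1  # demand is high: provision one more VM
    if avg_util < scale_in_at and current_vms > min_vms:
        return current_vms - 1  # demand is low: de-provision one VM
    return current_vms

vms = 4
for util in [0.82, 0.90, 0.40, 0.22, 0.18]:
    vms = autoscale(vms, util)
    print(f"util={util:.2f} -> {vms} VMs")
```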
Fog computing, on the other hand, acts as an intermediary between end devices and cloud computing, providing storage, networking, and computation services closer to edge devices. The introduction of edge computing has given rise to related paradigms such as Mobile Edge Computing (MEC) and Mobile Cloud Computing (MCC). MEC primarily targets 2- or 3-tier applications spanning the network and mobile devices served by contemporary cellular base stations; it improves network efficiency by optimizing content distribution and facilitating application development (Sabireen and Neelanarayanan 2021).
Table 3: Comparison between the features of cloud and fog computing paradigms
| Feature | Cloud Computing | Fog Computing |
|---|---|---|
| Proximity to End Devices | Distant | Proximate |
| Latency | High | Low |
| Mobility Support | Limited | High |
| Scalability | High | Moderate |
| Geographical Distribution | Centralized | Distributed |
| Resource Heterogeneity | Homogeneous | Heterogeneous |
| Network Bandwidth | High | Limited |
Cloud-fog architecture finds applications in various domains, including IoT, healthcare (Alatoun et al. 2022), transportation, smart cities, and industrial automation (Dogo et al. 2019). Healthcare providers can leverage fog nodes for real-time patient monitoring, while industrial automation systems can benefit from edge analytics for predictive maintenance.
Load Balancing and Task Scheduling Techniques in Cloud Computing
Before 2014, traditional methods such as FCFS, SJF, Min-min, Max–min, and RR were known for their slow processing and time-consuming job scheduling and load balancing.
Konjaang et al. (2018) examine the difficulties associated with the conventional Max–Min algorithm and propose the Expa-Max–Min method as a possible solution. The algorithm prioritizes cloudlets with the longest and shortest execution times to schedule them efficiently.
In 2019, Hung et al. (2019) introduced an enhanced max–min algorithm called MMSIA. The objective of the MMSIA algorithm is to improve the completion time in cloud computing by utilizing machine learning to cluster requests and optimize the utilization of virtual machines. The system allocates big requests to virtual machines (VMs) with the lowest utilization percentage, improving processing efficiency.
Kumar et al. (2018) state that the updated HEFT algorithm creates a Directed Acyclic Graph (DAG) of all jobs submitted to the cloud and assigns computation costs and communication edges across processing resources. Task ordering is determined by execution priority, which considers both the average time to complete each task over all processors and the communication costs between predecessor tasks.
The tasks are then sorted in decreasing order of priority and assigned to processors based on the shortest execution time. Similarly, Seth and Singh (2019) propose the Dynamic Heterogeneous Shortest Job First (DHSJF) model for job scheduling in cloud computing systems with heterogeneous capabilities.
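The execution priority used by HEFT is usually formalized as the task's upward rank: its average computation cost plus the maximum, over its successors, of the edge communication cost plus the successor's rank. A minimal sketch on a hypothetical four-task DAG follows; the cost values are invented for illustration.

```python
from functools import lru_cache

# Hypothetical 4-task DAG: average computation cost per task, and
# communication cost on each edge (predecessor -> successor).
comp = {"A": 10.0, "B": 6.0, "C": 8.0, "D": 4.0}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
comm = {("A", "B"): 2.0, ("A", "C"): 3.0, ("B", "D"): 1.0, ("C", "D"): 2.0}

@lru_cache(maxsize=None)
def upward_rank(task: str) -> float:
    """HEFT upward rank: average cost plus the most expensive path to exit."""
    if not succ[task]:
        return comp[task]
    return comp[task] + max(comm[(task, s)] + upward_rank(s) for s in succ[task])

# Scheduling order: decreasing upward rank (entry task A first, exit task D last).
order = sorted(comp, key=upward_rank, reverse=True)
print([(t, round(upward_rank(t), 1)) for t in order])
```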
Another technique that many authors increasingly employ is GWO. The technique maps the roles of grey wolves onto candidate solutions for distributing jobs or balancing workloads within a network or computing system. The alpha wolf leads the pack, representing the best solution found so far; the beta and delta wolves, representing the second- and third-best solutions, assist the alpha in decision-making and problem-solving; and the omega wolves, representing the remaining solutions, follow the guidance of the top three.
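A bare-bones sketch of the standard GWO position update follows (the textbook formulation, not any one surveyed variant): every wolf moves toward a blend of positions steered by the alpha, beta, and delta, with a coefficient decaying linearly from 2 to 0 so the search shifts from exploration to exploitation. In a scheduling setting the fitness function would encode makespan or load imbalance; here we minimize a toy sphere function.

```python
import random

def gwo_minimize(fitness, dim, bounds, n_wolves=20, iters=200):
    """Bare-bones Grey Wolf Optimizer for continuous minimization."""
    lo, hi = bounds
    wolves = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(iters):
        wolves.sort(key=fitness)
        # Snapshot the three leaders before any wolf moves.
        alpha, beta, delta = (w[:] for w in wolves[:3])
        a = 2 - 2 * t / iters  # decays linearly from 2 to 0
        for i in range(n_wolves):
            new_pos = []
            for d in range(dim):
                guided = []
                for leader in (alpha, beta, delta):
                    r1, r2 = random.random(), random.random()
                    A, C = 2 * a * r1 - a, 2 * r2
                    D = abs(C * leader[d] - wolves[i][d])
                    guided.append(leader[d] - A * D)
                new_pos.append(min(hi, max(lo, sum(guided) / 3)))
            wolves[i] = new_pos
    return min(wolves, key=fitness)

best = gwo_minimize(lambda x: sum(v * v for v in x), dim=3, bounds=(-5.0, 5.0))
print([round(v, 3) for v in best])
```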
Farrag et al. (2020) examine the application of the Ant Lion Optimizer (ALO) and Grey Wolf Optimizer (GWO) to job scheduling in cloud computing; both aim to optimize task makespan by dividing the workload effectively. Reddy et al. (2022) introduced the AVS-PGWO-RDA scheme, which uses Probabilistic Grey Wolf Optimization (PGWO) in the load balancer unit to find the ideal fitness value for selecting user tasks and allocating resources with lower complexity and time consumption.
Similarly, Janakiraman and Priya (2023) introduced the Hybrid Grey Wolf and Improved Particle Swarm Optimization Algorithm with Adaptive Inertial Weight-based multi-dimensional Learning Strategy (HGWIPSOA). This algorithm combines the Grey Wolf Optimization Algorithm (GWOA) with Particle Swarm Optimization (PSO) to efficiently assign tasks to Virtual Machines (VMs) and improve the accuracy and speed of task scheduling and resource allocation in cloud environments.
At the beginning