Ahmed Y.Hamedand Monagi H.Alkinani
1Faculty of Computers and Information,Department of Computer Science,Sohag University,Sohag,82524,Egypt
2Department of Computer Science and Artifciial Intelligence,College of Computer Science and Engineering,University of Jeddah,Jeddah,21959,Saudi Arabia
Abstract:Task scheduling is the main problem in cloud computing that reduces system performance;it is an important way to arrange user needs and perform multiple goals.Cloud computing is the most popular technology nowadays and has many research potential in various areas like resource allocation,task scheduling,security,privacy,etc.To improve system performance,an efficient task-scheduling algorithm is required.Existing task-scheduling algorithms focus on task-resource requirements,CPU memory,execution time,and execution cost.In this paper, a task scheduling algorithm based on a Genetic Algorithm (GA) has been presented for assigning and executing different tasks.The proposed algorithm aims to minimize both the completion time and execution cost of tasks and maximize resource utilization.We evaluate our algorithm’s performance by applying it to two examples with a different number of tasks and processors.The first example contains ten tasks and four processors; the computation costs are generated randomly.The last example has eight processors,and the number of tasks ranges from twenty to seventy;the computation cost of each task on different processors is generated randomly.The achieved results show that the proposed approach significantly succeeded in finding the optimal solutions for the three objectives;completion time,execution cost,and resource utilization.
Keywords: Cloud computing; task scheduling; genetic algorithm;optimization algorithm
Recently, cloud computing is the most popular technology; resource allocation, task scheduling, security, and privacy have been widely used in various fields.Scheduling plays an important role in improving the efficiency of all cloud-based services.In cloud computing, task scheduling is used to assign the task to the optimal resource for execution.Task scheduling algorithms have different types of algorithms and different issues as completion time, execution cost,complexity, etc.
Cloud computing has emerged as a new computing platform according to the development of virtualization and Internet technologies [1].It can be viewed as a distributed system containing interconnected and virtualized computers that are dynamically provisioned.It maintains the service-level agreements (SLA) between the users and the host applications [2].
Cloud computing is interested in resource management, security, performance, reliability,etc., [3].Resource management is one of the important issues in task scheduling.The task scheduling problem in cloud computing is how to distribute the tasks of users on the available hardware to improve the overall performance of the cloud computing environment [4].
In [5], the authors presented an implementation to the task scheduling using .NET and a GAbased scheduling algorithm to achieve the task and its priority.They grouped the available jobs and executed them using different proposed algorithms.In addition, in [6], a GA was proposed to solve the task scheduling in cloud computing under considering total task completion time,average task completion time, and cost constraint.
The objective of task scheduling in the multiprocessor system is to assign a dependent task to the processors, and the processing time will be reduced.To minimize the processing time,the GA has applied to the processors to obtain various solutions and faster processing time.Task scheduling considers two aspects:the earliest start time (EST) and some task dependencies(NTD).This comparison made by using Java simulation and the result obtained that the proposed algorithm solves minimum EST attains faster processing time than the maximum EST [7].
The task scheduling algorithms using Efficient State Space Search GA (ESSSGA) use the benefits of heuristic-based algorithms to minimize space search and time to obtain effective solutions [8].The task to processor mapping has been made using a heuristic-based earliest finish time approach that reduces the time regarding task execution time.
A new GA for task scheduling in the multiprocessor systems has indicated that task execution priority depends on the height of task graphs to perform scheduling.This method is simulated and used to compare with the basic genetic algorithm [9].The GA efficiency could be attained by the optimization of different parameters like mutation, crossover, selection function, and crossover probability.These GA parameters on the reduction of bi-criteria fitness functions and parameter setting will be accomplished by a central composite design approach with design experiments.The experiments use these parameters and analysis of variance, which reduce the total completion time and makespan [10].
A new GA is used for solving the problems in scheduling task graphs.The algorithm is entirely dependent on the new approach to reduce the communication cost of processors and the length of critical time.In order to solve the scheduling of the task graph, effective GA has been applied.GA proposed for scheduling the task graph that can be acquired is effective in scheduling with low time.The results obtained from the study stated that the algorithm related to graphs without communication cost could act quickly when compared to other MCP algorithms [11].
The GA chromosomes like task list (TL), processor list (PL), and integration of both(TLPLC).The experiments on real-world application graphs like Gaussian elimination, Gauss Jordan and Laplace equation, and LU decompositions.TLPLCGA is related to GA and heuristic algorithms regarding the processor’s time and efficiency conducted.The result experienced was that the hybrid approach performs better than the other algorithms [12].
The effectiveness of Node Duplication GA (NGA) based approach against the existing deterministic scheduling techniques for reducing the interprocessor traffic communication.The results obtained from the simulations indicate that the GA can use the scheduled task to meet deadlines and acquire high processor utilization.Performance analysis of NGA is compared with GA,FCFS, and List Scheduler [13].
An effective method based on GA is created to solve the problem of multiprocessor scheduling.This paper used GA for scheduling precedence task graphs with inter-task communication onto multiprocessors without considering the communication channel.Experimental results show that hard problems have been taken from the internet, illustrates GA with optimization of parameters [14].
The task scheduling problem has been formulated as a multi-objective optimization problem [15,16].In [15], the authors proposed a GA-DE algorithm based on GA and Differential Evolution (DE) to solve the problem under three constraints; total time, cost, and virtual machine load balancing.While [16] developed an EDA-GA hybrid scheduling algorithm based on EDA(estimation of distribution algorithm) and GA to solve the scheduling problem.
The optimal solution to the task scheduling problem cannot be obtained in a limited time and can be found by performing a comprehensive search.So, it is one of the NP-Complete problems [17–20].Therefore, this paper develops a GA-based algorithm to solve the task scheduling problem in the cloud environment.The proposed algorithm’s objective is to allocate and execute dependent tasks in an optimal manner to minimize both the completion time and execution cost and maximize resource utilization.
The rest of this paper is presented as follows:Section 2 discusses problem definition.In Section 3, the operations of the proposed algorithm are illustrated.Our GA approach to finding the optimal task scheduling for a cloud computing system is described in Section 4.Section 5 discusses the results, and in Section 6, conclusions are given.
G A task graph
DAG A Directed Acyclic Graph
tkTask k
Pi Processor i
M
Number of tasks
N Number of processors
ni Node i
ST (ni, p) Start time of node i on a processor p
FT (ni, p) Finish time of node i on a processor p
RT (pi) Ready time of the processor i
Wij Computation cost of task i on the processor j
Cost (Pj) The cost of processor j per second.
BjBusy time of Pj
LT Tasks’List based on DAG order.
DAT (ti, pj) The Data Arrival Time of task i at processor j
CP A critical Path of G
Pc Crossover ratio
Pm Mutation ratio
Pop_size Population size
GN Number of Generations
Maxgn Maximum generation
We denote the task scheduling in the cloud computing as a Graph G (M, E) with M nodes(n1, n2, n3,...,nM).Each node represents a task of G and E directed edges, denoting a partial request of the tasks.The partial request leads to a precedence-constrained (ni→nj), i.e.,niprecedesnjin the execution process.Each node represents an instruction that could be executed along with other instructions sequentially on the same processor; it has one or more inputs.Based on the availability of the inputs, the node (an entry or exit node) is triggered to execute [21].
The execution time of a nodeniis denoted by (ni) weight.If the processor’s processing speed is Psj, then the processing time for tasktion the processor j (Wij) can be calculated by the following equation.We call the processing time the computation cost.
The computation cost of node i on the processor j (Wij) is estimated randomly in the proposed algorithm.
Let C(ni, nj) be the communication cost of an edge (weight of an edge), and it will be equal to zero ifniandnjare processed on the same processor.All the computation and communication costs for a problem are generated randomly in the proposed algorithm.Fig.1 is a form of task scheduling in cloud computing.
Figure 1:The computation and communication costs of DAG
In this paper, the processors in cloud computing are heterogeneous.Therefore, the task’s computation cost varies according to the processor.The start and finish time ofniis denoted byST(ni;pj) andFT(ni;pj), respectively.
The Data Arrival Time (DAT) of tiat processor pjis given by, [21]:
whereN_parentis the number of ti’s parents andC(ti.tk)=0; iftiandtkare scheduled on the same processor.
The task scheduling problem in cloud computing can be defined as; Find the best assignment of the start times of the given tasks on processors such that the schedule length (the completion time) and execution cost are minimized with the condition that precedence-constrained is preserved.
The completion time is defined as the schedule length or finish time and is computed by:
where,
The following pseudo-code shows how to find the schedule length (denoted byS_Length)using SGA, [21]:
The total cost (Executin Cost) of all tasks on the available processors is calculated by:
The utilization of resources is given by dividing the total value ofBjover the completion time of an application.As follows, [22]:
That is, the objective is to minimize Eqs.(3), (5) and (6).
The following subsections investigate the different components of the proposed GA, encoding,initialization, objective function, crossover, and mutation operations.The GA is terminated when the best solution found, or the number of generations exceeds the Maxgn.
In the proposed GA, if we have M tasks and N processors, the chromosome is divided into two parts; distributing and scheduling parts.The distributing part represents the processor’s indices, and the scheduling part shows the tasks to be processed, as shown in Fig.2.According to Fig.2, the processor P1processes the tasks t1, t3, while t4and t6will be processed by P2,...etc.The length of the chromosome is linearly proportional to the number of tasks.
Figure 2:Tasks representation on processors
The initial population is generated randomly and according to the following steps:
(1) A chromosome X is generated, as shown in Fig.2.
(2) The first part of X is generated randomly from 1 to N.
(3) The second part is generated randomly from 1 to M taking into account the precedenceconstrained.
(4) Repeat from 1 to 3 to generate the number of chromosomes (population size).
This paper’s main objective is to map all the tasks to all the processors, minimize the completion time, execution cost, and maximize resource utilization.Therefore, the fitness function (Fit) of the candidate solution is the minimum value of the completion time.i.e.,Fit=Min(Completion Time).
4.4.1 The Crossover Operation
In the crossover, we use a 1-point crossover to produce one child from two selected parents based on the Pc value.The distributing part of the child is taken from the distributing part of the first parent, and the scheduling part of it is taken from the second parent.Fig.3 explains the crossover operation:
Figure 3:The crossover operation
4.4.2 The Mutation Operation
The mutation operation is performed on the distributing part of the selected parent based on the Pm value.The position to be mutated is selected randomly to change its value as shown in Fig.4.
Figure 4:The mutation operation
The following algorithm explains how to use the different components of the proposed GA as described in Section 3 to solve the task scheduling problem.
?
?
In this section, the proposed GA has been applied to two examples.The values of pop_size,Pc, and Pmare 20, 0.95, and 0.02, respectively.
In this example, the number of M is 10 tasks, and N is 4 processors.The communication cost between the tasks and the computation cost of each task on different processors are generated randomly from 1 to 20, and 1 to 5, respectively.The communication cost and the computation cost are shown in Tabs.1 and 2, respectively.
Table 1:The communication cost between the tasks
Table 2:The computation cost of each task on different processors
The cost of different processors per second is generated randomly from 1 to 10 and is shown in Tab.3.
Table 3:The cost of different processors per second
The best solution obtained by the proposed genetic algorithm is shown in Fig.5.
Figure 5:The best solution
The task scheduling on the different processors is shown in Tab.4 and Fig.6.
Table 4:The task scheduling on the different processors
Figure 6:The task scheduling on the different processors
The busy time of the processors is shown in Tab.5 and Fig.7.
Table 5:The busy time for each processor
Figure 7:The busy time for each processor
The available time of the processors is shown in Tab.6 and Fig.8.
Table 6:The available time of the processors
Figure 8:The available time of the processors
The completion time, execution cost, utilization, speedup, and efficiency are shown in the following table, Tab.7.
Table 7:The completion time, execution cost, utilization, speedup, and efficiency
In this example, consider four cases with N = 8 processors.The number of tasks is:20, 30,40, 50, and 70 tasks in the first, second, third, and fourth case (M = 20, 30, 40, 50, and 50).The communication cost between the tasks and the computation cost of each task on different processors are generated randomly from 1 to 20, and 1 to 5, respectively.
The completion time, execution cost, utilization, speedup, and efficiency are shown in the following table, Tab.8 and Figs.9–11.
Table 8:The completion time, execution cost, utilization, speedup, and efficiency
Figure 9:The completion time of the problems
Figure 11:The resource utilization of the problems
The proposed GA has successfully solved task scheduling problem in Cloud computing in this paper.The proposed algorithm targets to minimize completion time, execution cost and maximize resource utilization.The completeness and correctness of the proposed algorithm have been tested.This has proven that our proposed technique enabled us to obtain results faster, leading to saving time and effort.In other words, the use of the proposed genetic algorithm has played a major role in reducing the search space generated by the problem.
In summary, our experimental results indicate that the algorithm is more efficient than other heuristics.To the best of our knowledge, our method’s structure and design are designed for the task scheduling problem in the cloud computing environment.This has made it very hard to find common features with other previous methods for comparison reasons.
Acknowledgement:The authors thank the anonymous referees for their careful readings and provisions of helpful suggestions to improve the presentation.
Funding Statement:The authors received no specific funding for this study.
Conflicts of Interest:The authors declare that they have no conflicts of interest to report regarding the present study.
Computers Materials&Continua2021年12期