The concept of elasticity in computing is important for load-balancing and for the overall efficiency, availability, and scalability of resources. Systems in industries like healthcare, finance, and technology depend on elasticity to perform advanced analytics on their data. This article reviews the importance of elasticity in cloud computing as it relates to the use of virtualization through virtual machines and containers. We comprehensively survey other works that use virtualization methods for stream processing and cloud computing. Lastly, we discuss challenges in using virtualization methods, such as availability, security, and performance, and summarize our findings.
The ability to analyze high-volume data streams in real time is critical for industries like healthcare, finance, and technology.
These data streams can be analyzed to solve problems like epidemic prediction, fraud detection, and network monitoring. In these situations, real-time analytics allows for the prevention of catastrophes like disease spread, fraudulent charges, and network intrusion. Across industries, real-time analytics improves service quality and ultimately reduces costs.
Other benefits of using data stream processing include reduced I/O operations, increased scalability, and faster response times.
Stream processing is necessary for real-time analytics, and works by processing data over a rolling time period. This type of processing can be compared to batch processing, which occurs over blocks of data that are stored over time. For example, a company may store financial data hourly and perform a batch analysis on that data at the end of the day. A corresponding data stream processor would use the last few transactions on a client’s account to perform an analysis.
As such, batch processing analyzes a larger set of data in a single analysis than stream processing, which operates on micro-batches of data. The latency of batch processing ranges from minutes to hours, while the latency of stream processing ranges from milliseconds to seconds. Because batch processing tolerates longer latency, it can be used for more complex analytics than stream processing.
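As a concrete illustration of this distinction, the short Python sketch below contrasts an end-of-day batch aggregate with a rolling micro-batch check over the last few transactions. The transaction amounts, window size, and anomaly rule are illustrative assumptions, not values from any particular system.

    # Illustrative sketch: batch vs. stream-style processing of transaction amounts.
    # The feed, window size, and "5x the recent average" rule are assumptions.
    from collections import deque

    transactions = [12.5, 9.99, 250.0, 14.2, 7.5, 980.0, 11.0]  # hypothetical daily feed

    # Batch style: store everything, then analyze once at the end of the day.
    def batch_average(amounts):
        return sum(amounts) / len(amounts)

    # Stream style: keep only a rolling micro-batch of recent transactions
    # and react as each new one arrives.
    window = deque(maxlen=3)
    for amount in transactions:
        if window and amount > 5 * (sum(window) / len(window)):
            print(f"flag possible fraud: {amount:.2f}")
        window.append(amount)

    print(f"end-of-day batch average: {batch_average(transactions):.2f}")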
Virtualization techniques were born of necessity as servers grew in processing power and capacity. VMs and containers are similar in that both are used to maximize the resources of the native system on which they run.
Although the virtualization of multiple systems is a good option for scenarios where systems are limited, VMs and containers each have their respective advantages and disadvantages. While VMs each have their own OS and virtualized hardware, containers virtualize only the OS. As such, VMs can consume larger amounts of RAM and CPU clock cycles, while a container's use of only the native OS's bins, libraries, and system resources makes it more lightweight in comparison. It also follows that containers must have the same OS as the host system, while VMs can have an entirely different OS than the host. With VMs, a hypervisor provides an abstraction between the VM and the hardware. Some examples of hypervisors include KVM, Citrix Xen, and Microsoft Hyper-V. Containers are created, deployed, and scaled using container managers like Swarm and Kubernetes. The use of virtual machines and containers to achieve resource control can be compared with respect to throughput, latency, and various other measures of performance.
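As a small, hedged example of how lightweight container deployment can be, the sketch below starts a throwaway container with the Docker SDK for Python. It assumes a local Docker daemon and the docker Python package are installed, and the image tag is only an example.

    # Minimal sketch using the Docker SDK for Python; assumes a local Docker
    # daemon and the docker package. The image tag is only an example.
    import docker

    client = docker.from_env()

    # The container shares the host kernel, so it starts without booting a guest OS.
    container = client.containers.run("alpine:3.19", ["echo", "hello from a container"], detach=True)
    container.wait()                      # block until the short-lived container exits
    print(container.logs().decode().strip())
    container.remove()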
The use of virtualization in cloud computing has allowed for greater elasticity, as resources are load-balanced in real time. We define two types of elasticity: vertical and horizontal. Vertical elasticity is the scaling of resources such as CPU, cores, memory, and network. Horizontal elasticity is the addition or removal of computing resources, such as increasing the number of servers. In cloud computing, the concept of elasticity can be expressed in the following equation:

Elasticity = scalability + automation + optimization

Mechanisms of elasticity can be classified into seven types. The first is configuration, which is the assignment of resources by a cloud provider; resources can have either a fixed or a configurable assignment. For example, VMs may have a fixed assignment of CPU and memory. The second mechanism is scope, which occurs at the platform and infrastructure levels. At the platform level, multi-tiered applications like Google App Engine are used, while the infrastructure level is based on virtualization using VMs or containers. The third mechanism is purpose, such as reducing cost and energy, increasing performance and resource capability, and guaranteeing availability.
The fourth mechanism is mode (policy), and comprises the interactions necessary to perform an elastic operation. Elastic operations are performed in automatic mode, meaning the system is triggered by actions and load-balances according to those actions. The fifth mechanism is method, which encompasses horizontal and vertical scaling. The sixth mechanism is architecture, divided into centralized architecture, where there is only one elasticity controller, and decentralized architecture, where there are multiple elasticity controllers. The last mechanism is provider, meaning a solution can have a single or multiple cloud providers.
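To make the method and mode mechanisms more concrete, the following toy Python sketch implements a threshold-based horizontal scaling policy. The request rates, target load per server, and server bounds are illustrative assumptions, not values drawn from any of the surveyed systems.

    # Toy horizontal-elasticity policy: automation (a trigger rule) plus
    # scalability (adding or removing servers) plus optimization (keeping the
    # server count as small as the load allows). All numbers are assumptions.
    def desired_servers(load, target_per_server=100, min_servers=1, max_servers=10):
        needed = -(-load // target_per_server)   # ceiling division
        return max(min_servers, min(max_servers, needed))

    servers = 1
    for load in [80, 250, 900, 400, 60]:         # hypothetical requests per second over time
        new_count = desired_servers(load)
        if new_count != servers:
            action = "scale out" if new_count > servers else "scale in"
            print(f"load={load} req/s -> {action}: {servers} -> {new_count} servers")
            servers = new_count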
This paper discusses the need for high-volume stream processing and its relation to virtualization through virtual machines and containers. We define relevant terms like virtual machines, containers, and elasticity, and discuss their importance in this paper. We then summarize methodologies for deploying virtual machines and containers used in related works. Lastly, we discuss the motivation and challenges in using virtualization techniques and summarize our findings.
Researchers propose a two-stage software platform and architectural framework called Foggy. Foggy uses a three-plus-tier architecture to orchestrate cloud computing over an expanded network of edge ‘cloudlets’ and ‘edge gateways’. The approach uses OpenStack as an IaaS layer and sits on top of a Kubernetes cluster. It is tested using two forms of a stream-based face-detection application deployed strictly across a decentralized network of clouds, ‘cloudlets’, and ‘edge gateways’. The first test streams large photo files to the processing cloudlets, whereas the second test uses edge gateways to process photos and streams only the important image fragments to the cloudlets.
Another study tests the high-availability claims of Kubernetes' default configuration. The 99.999% uptime requirement is shown to be difficult to attain with the default configuration, as reaction and repair times reach upwards of 5 minutes after an unplanned pod termination. This time is expected to decrease with custom installations of Kubernetes that periodically execute graceful terminations of pods to maintain the cluster's overall health. To accurately compare performance, both graceful and ungraceful pod terminations are performed and monitored.
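For reference, the difference between a graceful and a forced (ungraceful) pod termination can be reproduced with the official Kubernetes Python client, as in the hedged sketch below. The pod name and namespace are placeholders, the snippet assumes access to a running cluster, and a forced deletion only roughly approximates the unplanned failures measured in the study.

    # Sketch using the official kubernetes Python client; the pod name and
    # namespace are placeholders, and cluster access is assumed.
    from kubernetes import client, config

    config.load_kube_config()            # or load_incluster_config() inside a pod
    v1 = client.CoreV1Api()

    # Two alternative deletions (not meant to be run back to back on the same pod):

    # Graceful: the pod is given its normal termination grace period to shut down.
    v1.delete_namespaced_pod(name="demo-pod", namespace="default")

    # Ungraceful: a zero grace period kills the pod immediately, roughly
    # approximating an unplanned termination.
    v1.delete_namespaced_pod(name="demo-pod", namespace="default", grace_period_seconds=0)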
One more study introduces the idea of a duo: the minimal building block of an elastic stream cloud system, which can play any one of three roles. A duo with a distribution role handles the basic acceptance and encoding of data streams, while one with a utility role provides an accessible API for third-party processing. These duos are managed by duos with organization roles on local and global scales. By defining these simple roles, duos can more readily fit into existing cloud infrastructures when processing demand increases.
Other research approaches the VM vs. container debate from a cloud-gaming perspective, in which cloud systems are tested for their ability to process graphics data, encode the resulting frames, and stream them back to a consumer. Benchmarks show that containerized cloud systems outperform virtual machine configurations in GPU-bound calculations, video encoding speeds, and startup and restore times, all while managing RAM and CPU resources more efficiently. The containerized implementations were consistently within 1% of native system performance and are expected to become more widely available as Docker container support is expanded in future releases.
Another study utilizes Ubernetes (a federation of Kubernetes clusters) to show how containerized cloud implementations have the potential to self-manage energy resources in addition to memory and processing resources. In one experiment, a multi-region cloud computing network established through Ubernetes and powered by photovoltaic solar, wind, and grid energy achieved an overall 22% decrease in reliance on grid energy. The experiment demonstrated the ability of containerized infrastructures to gracefully migrate between green and grid energy sources.
A further study introduces HarmonicIO, a cloud framework aimed at solving the many issues that come with processing large scientific datasets. HarmonicIO specializes in the stream processing of large data objects, upwards of 100 GB, by utilizing a P2P network of Docker containers paired with a message queue that helps avoid stream bottlenecks. The framework boasts high throughput through data parallelism, the natural elasticity of Docker containers, and simple extensibility.
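The queue-plus-workers pattern described above can be sketched generically in Python. This stand-in uses only the standard library rather than HarmonicIO's actual API, and the object references are hypothetical; in the real framework the workers would be Docker containers streaming the objects peer to peer.

    # Generic stand-in for the queue-plus-workers pattern; not HarmonicIO's API.
    # Workers model containerized processing nodes; object references are hypothetical.
    import queue
    import threading

    tasks = queue.Queue()

    def worker(worker_id):
        while True:
            object_ref = tasks.get()
            if object_ref is None:            # sentinel: no more work
                break
            print(f"worker {worker_id} processing {object_ref}")

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
    for t in workers:
        t.start()

    # The producer enqueues references to large objects rather than the objects
    # themselves, so the queue does not become a bandwidth bottleneck.
    for ref in ["object-001", "object-002", "object-003"]:
        tasks.put(ref)
    for _ in workers:
        tasks.put(None)
    for t in workers:
        t.join()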
High availability: Issues can arise depending on the failure-recovery practices of the infrastructure. The strict definition of true high availability leaves a margin of only about 5 minutes of downtime for every year of proper functionality. Even Kubernetes, which boasts high availability as one of its selling points, has been shown to have trouble maintaining it under default configurations. Upon ungraceful termination, a Kubernetes cluster can take upwards of 5 minutes to notice and recover from a single pod failure, meaning that a single pod failure is all that can be tolerated per year of runtime. These availability shortfalls can potentially be avoided with customized installations that include periodic graceful pod termination and instantiation.
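As a quick sanity check on that figure, the "five nines" budget works out to roughly five minutes per year:

    # Back-of-the-envelope downtime budget for 99.999% ("five nines") uptime.
    minutes_per_year = 365.25 * 24 * 60
    allowed_downtime = minutes_per_year * (1 - 0.99999)
    print(f"allowed downtime at 99.999% uptime: {allowed_downtime:.2f} minutes per year")  # about 5.26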
Startup time: Virtual machine implementations suffer from high startup times because every VM must boot its own instance of an operating system before it can begin processing information. This delay grows when a host machine's limited resources are split between multiple VMs and can reach levels deemed unacceptable by many end users: as the number of VMs on a single host increases, the static resources assignable to each VM decrease and each startup takes longer. Startup time is also an issue in containerized approaches when a container fails and must be replaced. The time between a container's failure and the moment a replacement container starts up and begins serving requests is vital to the overall performance of a stream processing system.
Resource availability: Cloud providers cannot offer unlimited resources, and must offer resources at certain price points depending on profitability, availability, competition, and other limiting factors such as energy consumption and green energy taxes. Consumers of these cloud systems also deal with limitations such as latency due to geographical location, budgeting restrictions, and infrastructure requirements. While VM and container infrastructures offer the ability to dynamically spin up new workstations in order to maintain performance as processing needs fluctuate, resources may become strained as they are shared amongst VMs or containers on a single host machine.
Security costs: Concerns arise when containers are not set up with security in mind. Because containers share the host's operating system, if the host machine is compromised, all containers running on it become directly vulnerable to whatever compromised the host or a neighboring container. If not properly configured, this vulnerability could bring down whole container clusters, which would also create an availability nightmare. Virtual machines are more secure in this respect, as they run processes in their own confined environments and do not leave neighboring VMs at risk if they become compromised, but they achieve this security at the cost of longer startup times.