An Open-Source Benchmark Suite for Microservices and Their Hardware-Software Implications for Cloud & Edge Systems

Status:: 🟩
Links:: Microservices Reference and Benchmark Applications

Metadata

Authors:: Gan, Yu; Zhang, Yanqi; Cheng, Dailun; Shetty, Ankitha; Rathi, Priyal; Katarki, Nayan; Bruno, Ariana; Hu, Justin; Ritchken, Brian; Jackson, Brendon; Hu, Kelvin; Pancholi, Meghna; He, Yuan; Clancy, Brett; Colen, Chris; Wen, Fukang; Leung, Catherine; Wang, Siyuan; Zaruvinsky, Leon; Espinosa, Mateo; Lin, Rick; Liu, Zhongling; Padilla, Jake; Delimitrou, Christina
Title:: An Open-Source Benchmark Suite for Microservices and Their Hardware-Software Implications for Cloud & Edge Systems
Date:: 2019
Publisher:: Association for Computing Machinery
URL:: https://dl.acm.org/doi/10.1145/3297858.3304013
DOI:: 10.1145/3297858.3304013

Notes & Annotations

Color-coded highlighting system used for annotations

📑 Annotations (imported on 2024-03-21#18:25:06)

gan.etal.2019.opensourcebenchmarksuite (pg. 2)

The richer the functionality of cloud services becomes, the more the modular design of microservices helps manage system complexity. They similarly facilitate deploying, scaling, and updating individual microservices independently, avoiding long development cycles, and improving elasticity.

gan.etal.2019.opensourcebenchmarksuite (pg. 2)

Even though modularity in cloud services was already part of the Service-Oriented Architecture (SOA) design approach [77], the fine granularity of microservices, and their independent deployment create hardware and software challenges different from those in traditional SOA workloads.

gan.etal.2019.opensourcebenchmarksuite (pg. 2)

The DeathStarBench suite includes six end-to-end services that cover a wide spectrum of popular cloud and edge services: a social network, a media service (movie reviewing, renting, streaming), an e-commerce site, a secure banking system, and Swarm; an IoT service for coordination control of drone swarms, with and without a cloud backend.

gan.etal.2019.opensourcebenchmarksuite (pg. 2)

Finally, to track how user requests progress through microservices, we have developed a lightweight and transparent to the user distributed tracing system, similar to Dapper [76] and Zipkin [17] that tracks requests at RPC granularity, associates RPCs belonging to the same end-to-end request, and records traces in a centralized database. We study both traffic generated by real users of the services, and synthetic loads generated by open-loop workload generators.
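Note to self: a minimal sketch of the tracing idea in this quote (record each RPC, link RPCs that belong to the same end-to-end request, store everything centrally). This is not the paper's tracing system; `Span`, `TraceStore`, `call`, and the service names are hypothetical and chosen only for illustration, and the in-memory dictionary stands in for the centralized database.

```python
import uuid
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One traced RPC (hypothetical record layout)."""
    trace_id: str             # shared by every RPC of one end-to-end request
    span_id: str              # unique per RPC
    parent_id: Optional[str]  # span of the caller, None for the entry point
    service: str
    duration_ms: float


class TraceStore:
    """In-memory stand-in for a centralized trace database."""

    def __init__(self) -> None:
        self._by_trace = defaultdict(list)

    def record(self, span: Span) -> None:
        self._by_trace[span.trace_id].append(span)

    def spans(self, trace_id: str) -> list:
        return list(self._by_trace[trace_id])


def call(store, service, duration_ms, trace_id=None, parent_id=None):
    """Record one RPC. A missing trace_id starts a new end-to-end request."""
    span = Span(
        trace_id=trace_id or uuid.uuid4().hex,
        span_id=uuid.uuid4().hex,
        parent_id=parent_id,
        service=service,
        duration_ms=duration_ms,
    )
    store.record(span)
    return span


store = TraceStore()

# One user request that fans out across three (illustrative) tiers.
root = call(store, "frontend", 5.0)
mid = call(store, "compose", 3.0, root.trace_id, root.span_id)
leaf = call(store, "storage", 1.5, root.trace_id, mid.span_id)

# A second, unrelated user request gets its own trace id.
other = call(store, "frontend", 4.0)

for span in store.spans(root.trace_id):
    print(span.service, span.parent_id, span.duration_ms)
```

Propagating `trace_id` and the caller's `span_id` with each call is what lets the store rebuild the per-request call graph later; a real system would carry these identifiers in RPC metadata rather than as function arguments.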

gan.etal.2019.opensourcebenchmarksuite (image) (pg. 3)

Fig. 3 shows the breakdown of execution time to network (red) and application processing (green) for three monolithic services (NGINX, memcached, MongoDB) and the end-to-end Social Network application

gan.etal.2019.opensourcebenchmarksuite (pg. 3)

Unlike monolithic services though, microservices spend much more time sending and processing network requests over RPCs or other REST APIs.

gan.etal.2019.opensourcebenchmarksuite (pg. 3)

While for the single-tier services only a small amount of time goes towards network processing, when using microservices, this time increases to 36.3% of total execution time, causing the system’s resource bottlenecks to change drastically.

gan.etal.2019.opensourcebenchmarksuite (pg. 3)

Third, microservices significantly complicate cluster management. Even though the cluster manager can scale out individual microservices on-demand instead of the entire monolith, dependencies between microservices introduce backpressure effects and cascading QoS violations that quickly propagate through the system, making performance unpredictable.

gan.etal.2019.opensourcebenchmarksuite (pg. 11)

Finally, the fact that hotspots propagate between tiers means that once microservices experience a QoS violation, they need longer to recover than traditional monolithic applications, even in the presence of autoscaling mechanisms, which most cloud providers employ.

gan.etal.2019.opensourcebenchmarksuite (image) (pg. 11)

gan.etal.2019.opensourcebenchmarksuite (pg. 13)

Finally, we compare the impact of slow servers in clusters of equal size for the monolithic design of Social Network. In this case goodput is higher, even as cluster sizes grow, since a single slow server only affects the instance of the monolith hosted on it, while the other instances operate independently.

gan.etal.2019.opensourcebenchmarksuite (pg. 13)

In general, the more complex an application’s microservices graph, the more impactful slow servers are, as the probability that a service on the critical path will be degraded increases.
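Note to self: a back-of-the-envelope way to see why graph complexity matters. This is my own simplification, not a model from the paper: assume each of the `n_services` tiers on the critical path lands on a slow server independently with probability `p_slow`. The chance that at least one tier is degraded is then 1 - (1 - p_slow)^n_services, which grows with the number of tiers. The function name and the example values are hypothetical.

```python
def p_critical_path_degraded(n_services: int, p_slow: float) -> float:
    """Probability that at least one of n_services tiers sits on a slow server.

    Assumes tiers are placed independently and each is slow with
    probability p_slow (a simplifying assumption, not a measured value).
    """
    if n_services < 0:
        raise ValueError("n_services must be non-negative")
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be in [0, 1]")
    return 1.0 - (1.0 - p_slow) ** n_services


# With 1% slow servers, a deeper critical path is hit far more often
# than a single-tier service.
for n in (1, 5, 30):
    print(n, round(p_critical_path_degraded(n, 0.01), 4))
```

Under the same assumption, a single slow server only degrades the one monolith instance hosted on it, which matches the quote from pg. 13 about the monolithic design keeping higher goodput.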