Stepwise migration of a monolith to a microservice architecture: Performance and migration effort evaluation

Status:: 🟩
Links:: Modular Monolith, Microservices vs. Monolith

Metadata

Authors:: Faustino, Diogo; Gonçalves, Nuno; Portela, Manuel; Rito Silva, António
Title:: Stepwise migration of a monolith to a microservice architecture: Performance and migration effort evaluation
Publication Title:: "Performance Evaluation"
Date:: 2024
URL:: https://www.sciencedirect.com/science/article/pii/S0166531624000166
DOI:: 10.1016/j.peva.2024.102411

Notes & Annotations

Color-coded highlighting system used for annotations

📑 Annotations (imported on 2024-03-23#11:45:46)

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 1)

Research has been done on the comparison of the performance quality between a monolith system and its correspondent implementation using a microservice architecture, but these results are sometimes contradictory, e.g. [4,5], address different characteristics of a microservices system, e.g. [6,7], or are evaluated using simple systems, e.g. [8].

[4] M. Villamizar, O. Garcés, H. Castro, M. Verano, L. Salamanca, R. Casallas, S. Gil, Evaluating the monolithic and the microservice architecture pattern to deploy web applications in the cloud, in: 2015 10th Computing Colombian Conference, 10CCC, 2015, pp. 583–590, http://dx.doi.org/10.1109/ColumbianCC.2015.7333476.
[5] T. Ueda, T. Nakaike, M. Ohara, Workload characterization for microservices, in: 2016 IEEE International Symposium on Workload Characterization, IISWC, 2016, pp. 1–10, http://dx.doi.org/10.1109/IISWC.2016.7581269.
[6] A.M. Joy, Performance comparison between linux containers and virtual machines, in: 2015 International Conference on Advances in Computer Engineering and Applications, 2015, pp. 342–346, http://dx.doi.org/10.1109/ICACEA.2015.7164727.
[7] O. Al-Debagy, P. Martinek, A comparative review of microservices and monolithic architectures, in: 2018 IEEE 18th International Symposium on Computational Intelligence and Informatics, CINTI, 2018, pp. 149–154, http://dx.doi.org/10.1109/CINTI.2018.8928192.
[8] F. Tapia, M. Mora, W. Fuertes, H. Aules, E. Flores, T. Toulkeridis, From monolithic systems to microservices: A comparative study of performance, Appl. Sci. 10 (17) (2020) http://dx.doi.org/10.3390/app10175797.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 1)

We observed that the modularization phase requires most of the refactoring effort, while the impact on latency is already significant in the absence of remote invocations between modules.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 2)

Ueda et al. [5] compare the performance of microservices with the monolith architecture to conclude that the performance gap increases with the granularity of microservices, where the monolith performs better. Villamizar et al. [4] show different results, concluding that in some situations the performance is better in the microservices context and that it reduces the infrastructure costs, but the request time increases in microservices due to the gateway overhead. Al-Debagy and Martinek [7] conclude that they have similar performance values for average loads, and the monolith performs better for small loads. In a second scenario, they found that the monolith has better throughput but similar latency when the system is stressed in terms of simultaneous requests. Bjørndal et al. [14] benchmark a library system that has four use cases and consider synchronous and asynchronous relations between microservices. They observe that the monolith performs better except for scalability. Therefore, they identify the need to carefully design the microservices, in order to reduce the communication between them to a minimum, and conclude that it would be interesting to apply these measures to systems that are closer to the kind of systems used by companies. Guamán et al. [15] designed and implemented a multi-stage architectural migration into a microservice architecture and compared the performance of the monolith and the intermediary stages with the microservice architecture. They concluded that the microservice architecture has worse latency. Flygare et al. [16] found that in their case study, the monolith performed better in terms of latency and throughput while consuming fewer resources than the microservice architecture. They also observed some throughput benefits of running the microservices in a cluster instead of on a single computer.

[14] N. Bjørndal, A. Bucchiarone, M. Mazzara, N. Dragoni, S. Dustdar, F.B. Kessler, T. Wien, Migration from monolith to microservices: Benchmarking a case study, 2020, http://dx.doi.org/10.13140/RG.2.2.27715.14883, unpublished.
[15] D. Guaman, L. Yaguachi, C.C. Samanta, J.H. Danilo, F. Soto, Performance evaluation in the migration process from a monolithic application to microservices, in: 2018 13th Iberian Conference on Information Systems and Technologies, CISTI, IEEE, 2018, pp. 1–8.
[16] R. Flygare, A. Holmqvist, Performance Characteristics Between Monolithic and Microservice-Based Systems (Bachelor’s thesis), Faculty of Computing at Blekinge Institute of Technology, 2017.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 2)

Some other perspectives compare the performance of monolith and microservices systems in terms of the distributed architecture of the solution, such as master–slave [17], the characteristics of the running environment, whether it uses containers or virtual machines [6], the particular technology used, such as different microservice discovery technologies [7], or other aspects of microservices deployment [8].

[17] M. Amaral, J. Polo, D. Carrera, I. Mohomed, M. Unuvar, M. Steinder, Performance evaluation of microservices architectures using containers, in: 2015 IEEE 14th International Symposium on Network Computing and Applications, 2015, pp. 27–34, http://dx.doi.org/10.1109/NCA.2015.49.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 2)

A major aspect of performance is the type of inter-microservice communication and the technology used. Hong et al. [18] compared the performance of synchronous and asynchronous communication and concluded that the asynchronous approach offered a more stable overall performance but a lower request-response performance. Fernandes et al. [19] presented similar results, with asynchronous communication outperforming REST communication in performance and data loss prevention in a large data context. Shafabakhsh et al. [20] build on the research of Fernandes et al. [19] and conclude that synchronous communication is beneficial under small loads.

[18] X.J. Hong, H.S. Yang, Y.H. Kim, Performance analysis of RESTful API and RabbitMQ for microservice web application, in: 2018 International Conference on Information and Communication Technology Convergence, ICTC, IEEE, 2018, pp. 257–259.
[19] J.L. Fernandes, I.C. Lopes, J.J.P.C. Rodrigues, S. Ullah, Performance evaluation of RESTful web services and AMQP protocol, in: 2013 Fifth International Conference on Ubiquitous and Future Networks, ICUFN, 2013, pp. 810–815, http://dx.doi.org/10.1109/ICUFN.2013.6614932.
[20] B. Shafabakhsh, R. Lagerström, S. Hacks, Evaluating the impact of inter process communication in microservice architectures, in: QuASoQ@ APSEC, 2020, pp. 55–63.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 2)

Haywood [21] introduces the modular monolith and compares it with the microservice architecture. The modular monolith decomposes the monolith into a set of modules that do not share any data and interact through well-defined interfaces. In reality, the modular monolith can be seen as an intermediate step for the migration of a monolith to a microservice architecture, where modularity is achieved but independent scalability of modules is not possible.

[21] D. Haywood, In defense of the monolith, in: Microservices Vs. Monoliths - The Reality Beyond the Hype, Vol. 52, InfoQ, 2017, pp. 18–37, URL https://www.infoQ.com/minibooks/emag-microservices-monoliths.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (image) (pg. 3)

Two-step migration

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 3)

Modularization of a monolith requires the decomposition of its domain model into different modules, so that each of the modules does not interdepend on a shared data repository [21], and the interactions are done through the module interfaces. The domain model represents the persistent domain entities of the application, usually defined using an Object-Relational Mapper (ORM) [22].

[22] M. Fowler, Patterns of Enterprise Application Architecture, Addison-Wesley Longman Publishing Co., Inc., 2002.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 4)

The following performance optimizations are applied:

  • Associate a database access index to the unique identifiers of domain entities exported to other modules, due to the frequent invocations to obtain a DTO object of the domain entity, given its unique identifier.
  • Whenever a DTO is generated, by default, it is loaded with the values for all its attributes, which work as a cache, to reduce further inter-module invocations to obtain each one of its values. This introduces memory overhead because some of the attribute values may not be necessary in the context of that particular interaction. However, in general, after systematically applying this tactic, the performance of the system improves.
  • In some cases, after performance evaluation, it is necessary to have larger DTO objects, which, in addition to the attribute values, also contain the information associated with several interrelated domain entities. For instance, a DTO containing a list of DTOs may reduce the number of inter-module invocations when a module is interacting with a collection of domain entities in another module.
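
The DTO tactics listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `FragmentDto`, `FragmentListDto`, and `TextModule` are hypothetical, and a dictionary lookup stands in for the indexed database access by unique identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentDto:
    """Eagerly loaded snapshot: every attribute is copied when the DTO is built."""
    fragment_id: str
    title: str
    text: str


@dataclass(frozen=True)
class FragmentListDto:
    """Coarse-grained DTO: a single invocation returns a whole collection."""
    fragments: tuple[FragmentDto, ...]


class TextModule:
    """Hypothetical module that owns fragments and counts interface invocations."""

    def __init__(self, rows: dict[str, tuple[str, str]]) -> None:
        # id -> (title, text); the dict stands in for an indexed table (assumption).
        self._rows = rows
        self.invocations = 0

    def get_fragment(self, fragment_id: str) -> FragmentDto:
        """Fine-grained interface: one invocation per entity."""
        self.invocations += 1
        title, text = self._rows[fragment_id]
        return FragmentDto(fragment_id, title, text)

    def get_all_fragments(self) -> FragmentListDto:
        """Coarse-grained interface: one invocation for the whole collection."""
        self.invocations += 1
        return FragmentListDto(
            tuple(FragmentDto(i, title, text) for i, (title, text) in self._rows.items())
        )
```

Fetching 100 fragments one by one through `get_fragment` costs 100 inter-module invocations, whereas `get_all_fragments` costs one, at the price of a larger DTO that may carry attributes the caller does not need.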

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 5)

Similarly to the modular monolith, the definition of several microservices and the introduction of synchronous communication in the architecture raise performance problems associated with the number of remote invocations per functionality and the amount of information transferred in DTOs. This is due to the extra latency resulting from remote communication, which is significantly higher than in the modular monolith. The following optimizations are applied:

  • Implementation of caches in microservices to speed up information retrieval and reduce the number of remote invocations. Some of these caches are created upon microservice creation and the data is preserved until the microservices stop their execution; this is very effective when the chosen data remains immutable throughout the execution;
  • When necessary, DTOs containing the information of several interrelated domain entities have to be defined to reduce the number of remote invocations, though it increases the amount of information sent in a single request.
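
The first optimization above, a cache created when the microservice starts and kept until it stops, might look like the sketch below. The class names and the `fetch_all` callback are hypothetical stand-ins for a real remote client, and the approach is only safe under the stated assumption that the cached data is immutable during execution.

```python
from typing import Callable


class FakeRemoteService:
    """Hypothetical stand-in for another microservice; counts remote invocations."""

    def __init__(self, data: dict[str, str]) -> None:
        self._data = data
        self.calls = 0

    def fetch_all(self) -> dict[str, str]:
        self.calls += 1
        return dict(self._data)


class StartupCache:
    """Cache filled once at service start-up and preserved until shutdown."""

    def __init__(self, fetch_all: Callable[[], dict[str, str]]) -> None:
        # One bulk remote invocation at start-up replaces one invocation per lookup.
        self._data = dict(fetch_all())

    def get(self, key: str) -> str:
        # Served from local memory; no remote invocation happens here.
        return self._data[key]
```

After construction, any number of `get` calls leaves the remote invocation count at one. If the underlying data can change, the cache would need an invalidation or refresh policy, which this sketch does not include.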

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (image) (pg. 7)

Performance results for sequentially executing each functionality 50 times, with 100 and 720 fragments in the database, while running inside Docker containers. Results in each cell are separated by "/"; for instance, by sequentially executing the Source Listing functionality 50 times in the monolith, we observed an average latency of 17 and 61 milliseconds with 100 and 720 fragments in the database, respectively (17/61). We ran the experiment 50 times to obtain an average, and we used 100 and 720 fragments to observe the effect of the amount of data being transferred.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 7)

For each functionality, a test case was designed to compare the performance of all three architectures. Each test case run simulates a user that sequentially submits 50 requests after a first request to warm up the caches. The test cases are run for two different loads of the database, 100 and 720 fragments, respectively. The results are comparable on a single machine because the requests are sequential. The results are shown in Table 4.
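
A sequential test case of this shape could be sketched as below. The function name and the `request` callable are hypothetical; the paper's actual test tooling is not described in these notes, so this only mirrors the stated procedure (one warm-up request, then 50 timed sequential requests, averaged).

```python
import time
from statistics import mean
from typing import Callable


def average_latency_ms(request: Callable[[], object], runs: int = 50) -> float:
    """Average latency in milliseconds over `runs` sequential requests."""
    request()  # warm-up request to fill the caches; excluded from the measurement
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        request()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)
```

In practice `request` would wrap an HTTP call to one functionality, and the measurement would be repeated once per database load (100 and 720 fragments).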

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 7)

In terms of performance comparison between the modular monolith and the microservice architecture, it can be observed that the microservice architecture has a severe negative impact on performance, both in terms of latency and throughput, independently of the functionality and information in the database.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 8)

These performance values are a consequence of the number of remote invocations necessary to implement the functionality. In each request/response of a remote invocation, the latency values increase due to network overhead, such as serialization/deserialization times, which severely affect the performance as the amount of information increases, showing a severe drawback of remote invocations.
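
To see how serialization cost grows with payload size, one could time a round trip for payloads of different sizes. The sketch below assumes JSON as the wire format and a made-up fragment shape; the notes do not state which serialization format or DTO layout the system uses.

```python
import json
import time


def round_trip(n_items: int) -> tuple[int, float]:
    """Serialize and deserialize a list of n_items hypothetical fragment DTOs.

    Returns (payload size in bytes, elapsed time in milliseconds).
    """
    payload = [
        {"id": i, "title": f"Fragment {i}", "text": "x" * 1000}
        for i in range(n_items)
    ]
    start = time.perf_counter()
    encoded = json.dumps(payload)   # serialization on the sending side
    decoded = json.loads(encoded)   # deserialization on the receiving side
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    assert decoded == payload
    return len(encoded.encode("utf-8")), elapsed_ms
```

Calling `round_trip(100)` and `round_trip(720)` mirrors the two database loads used in the experiments. The byte count grows deterministically with the number of items, while the timings depend on the machine.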

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 8)

Surprisingly, the microservice architecture did not impact the performance of the Assisted Ordering functionality (Table 4: a −7% variation from the modular monolith to microservices with 720 fragments). This is mainly due to two reasons. First, the functionality is computationally demanding, which reduces the impact of distributed communication on overall performance. Second, the use of caches proved to be effective in optimizing the performance of the functionality.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 8)

Overall, when comparing the performance of the monolith with the microservices system, and despite all optimizations, the differences were significant for all functionalities, with the latency variation ranging between 201% and 6977%. This is the consequence of the introduction of additional network overhead caused by remote invocations. Note that a major performance bottleneck of remote invocations came from the need to serialize and deserialize DTOs on each invocation because it introduces additional latency that becomes more noticeable as the amount of information increases. For instance, we measured and observed that even in coarse-grained communication, the serialization/deserialization time of the Source Listing functionality corresponded to 82% of the average latency of the functionality, reaching a serialization time of over 3000 ms and a deserialization time of 560 ms. This has a considerable impact on the performance of the functionality, which further increases the latency of the functionality in the modular monolith.
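
The figures in this passage can be related with some back-of-the-envelope arithmetic. The sketch below makes two assumptions that the notes do not confirm: that "variation" means relative change with respect to the baseline, and that the 82% share and the 3000 ms / 560 ms figures refer to the same measurement.

```python
def latency_variation_pct(baseline_ms: float, migrated_ms: float) -> float:
    """Relative latency change in percent (assumed definition of 'variation')."""
    return (migrated_ms - baseline_ms) * 100.0 / baseline_ms


def implied_total_ms(serialize_ms: float, deserialize_ms: float, share: float) -> float:
    """Total latency implied when (de)serialization makes up `share` of it."""
    return (serialize_ms + deserialize_ms) / share


# Roughly 4341 ms; a lower bound, because the notes say "over 3000 ms".
source_listing_total_ms = implied_total_ms(3000.0, 560.0, 0.82)
```

Under the assumed definition, a 201% variation means the migrated latency is about three times the baseline, and a 6977% variation means it is about 71 times the baseline.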

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 8)

So far, we have analyzed the impact of migration on performance but not on the scalability aspects associated with the possibility of independent scalability in microservice architecture. Therefore, another testing setup was necessary. This testing was carried out with the microservices being deployed in a Google Kubernetes Engine cluster with 8 nodes, 16 vCPU and 32 GB of memory. To allow a comparison with a deployment without independent scalability, two run-time architectures were considered: single-instance and multi-instance deployments. The former is composed of a single instance of each microservice, and the latter is composed of five instances of each of the Text and Virtual microservices, and a single instance of the remaining microservices. There are also five instances of the front-end that send requests. This allows us to evaluate how increasing the resources of specific microservices affects performance. Additionally, to allow comparison with the previous experiment, two different workloads were considered: a sequential workload and a concurrent workload. The former is similar to the local experiment and the latter simulates 50 different users concurrently invoking the functionalities.
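
A concurrent workload of this kind could be simulated as in the sketch below. The function name and the `request` callable are hypothetical, and the sketch assumes each simulated user sends exactly one request; the notes do not say how many requests each user submits or which load-testing tool was used.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def concurrent_throughput(request: Callable[[], object], users: int = 50) -> float:
    """Requests per second when `users` simulated users each send one request."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(request) for _ in range(users)]
        for future in futures:
            future.result()  # wait for completion and re-raise any request failure
    elapsed = time.perf_counter() - start
    return users / elapsed
```

Running the same function against the single-instance and multi-instance deployments would give the throughput comparison described above; threads are adequate here because the simulated users spend their time waiting on network I/O.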

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 8)

It can be observed that there are some benefits and drawbacks to the different runtime versions of the architecture under different usages of the application. In terms of concurrent workload, there is a significant increase in the throughput of running multiple instances of specific microservices for all three functionalities.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (image) (pg. 9)

Performance results for sequentially executing each functionality 50 times and for 50 users concurrently executing each functionality, with 100 and 720 fragments in the database, while deployed in the Google Kubernetes Engine cluster. Results in each cell are separated by "/"; for instance, by sequentially executing the Source Listing functionality 50 times, we observed an average latency of 1768 and 21 806 milliseconds with 100 and 720 fragments in the database, respectively (1768/21 806).

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 9)

Overall, we could observe a scaling benefit of a microservice architecture; however, there was a significant performance degradation of running the microservice application in a cloud environment compared to our local deployment. Despite the throughput increase of the multi-instance version, the latency values were significantly high, especially for functionalities with large amounts of information such as Fragment Listing and Source Listing, resulting in a generally bad user experience. This is due to the additional network overheads that are introduced with remote invocation through a real network, which are not suitable for large payloads of information or fine-grained invocations.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 10)

To improve performance, different caches were implemented in the modular and microservice architectures. However, throughout their use in the application, these caches might become inconsistent as the information changes and thus affect the consistency of the application. Therefore, it is also necessary to perform an evaluation of the consistency of cache data and its impact on application behavior. In particular, it is necessary to analyze the immutability level of the data in cache and its frequency of change.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 11)

Migration to a microservice architecture revealed a significant impact on the application for both refactoring and performance. Most of the migration cost was connected to the modules and the interfaces to implement microservices with the desired inter-microservice communication, in which the quality of the modules and interfaces has a significant impact on the cost. On the other hand, a serious consequence of the migration was the large impact on performance associated with inter-microservice communication, which required functionalities to be reimplemented to have coarse-grained interactions.

faustino.etal.2024.stepwisemigrationmonolithmicroservicearchitecture (pg. 11)

In the LdoD Archive, the migration had a serious impact on performance due to network overhead that proved to be far too high compared to previous architectures but offered a more scalable and manageable architecture.