13th International Congress on Engineering, Architecture and Design (13. Uluslararası Mühendislik Mimarlık ve Tasarım Kongresi), Istanbul, Türkiye, 8-9 June 2024, pp. 10-20
As microservices-based architectures grow in popularity, testing the resilience of these systems has become increasingly important. Many studies have examined core concerns such as security, scalability, and performance in depth. However, the rapid pace of technological change has made it difficult to ensure the reliability of these architectures. Innovative methods such as chaos engineering are therefore gaining importance as a way to address these challenges.
Chaos engineering is the practice of deliberately introducing faults or outages into different parts of a system in order to test its resilience and identify potential weak points. It is especially valuable for architectures with complex interconnections. This article highlights the importance of chaos engineering for microservices architectures and provides an overview of commonly used tools and technologies for chaos experiments. It also outlines best practices for planning and executing chaos experiments and discusses the benefits and challenges of using chaos engineering in practice. To carry out the chaos experiments, a project was used as a testbed; it makes heavy use of microservices and clusters and comprises a mail API, an authentication API, a frontend API, and a backend API built with Python technologies. Various chaos experiment scenarios, including Pod-based, Network-based, and Utilization-based experiments, were applied to the project using open-source tools such as Chaos Mesh and Litmus.
Keywords: Microservices, Resilience, Durability, Chaos Engineering, Fault Tolerance, Simulation, Kubernetes, Cluster, Pod
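As an illustration of the three experiment categories named in the abstract, the following minimal Python sketch builds one Chaos Mesh manifest per category (PodChaos, NetworkChaos, StressChaos) and prints each as JSON, which `kubectl apply -f` accepts alongside YAML. It is not the configuration used in the study: the `testbed` namespace, the `app` label values, the experiment names, and the latency and load figures are all hypothetical placeholders chosen for the example.

```python
import json

# Chaos Mesh custom resources share this API group/version.
API_VERSION = "chaos-mesh.org/v1alpha1"


def make_experiment(kind, name, namespace, app_label, spec_extra):
    """Build a Chaos Mesh manifest as a plain dict.

    All names passed in below are hypothetical; they are not taken
    from the testbed described in the abstract.
    """
    spec = {
        # "one" targets a single randomly chosen matching pod.
        "mode": "one",
        "selector": {
            "namespaces": [namespace],
            "labelSelectors": {"app": app_label},
        },
    }
    spec.update(spec_extra)
    return {
        "apiVersion": API_VERSION,
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


experiments = [
    # Pod-based: kill one pod of the (assumed) backend deployment.
    make_experiment("PodChaos", "kill-backend-pod", "testbed",
                    "backend-api", {"action": "pod-kill"}),
    # Network-based: add latency to traffic of the (assumed) mail service.
    make_experiment("NetworkChaos", "delay-mail-api", "testbed",
                    "mail-api",
                    {"action": "delay", "delay": {"latency": "200ms"}}),
    # Utilization-based: put CPU pressure on the (assumed) auth service.
    make_experiment("StressChaos", "stress-auth-api", "testbed",
                    "auth-api",
                    {"stressors": {"cpu": {"workers": 1, "load": 80}}}),
]

if __name__ == "__main__":
    for experiment in experiments:
        print(json.dumps(experiment, indent=2))
```

Generating the manifests from one helper keeps the target selection identical across experiments, so only the injected fault differs between runs. Each printed document could be saved to a file and applied to a cluster where Chaos Mesh is installed.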