site-reliability-engineering

Chaos engineering is the discipline of experimenting on a software system in production in order to build confidence in the system's capability to withstand turbulent and unexpected conditions. It is a disciplined approach to identifying failures before they become outages.

Here are 62 public repositories matching this topic...

howtheysre
chaos-mesh
STRRL commented Feb 11, 2022

The controller-runtime library requires a logger to be set (via log.SetLogger()) within the first 30 seconds after the application starts; otherwise it falls back to the default NullLogSink. We should also call it in our test code.

When we test with Ginkgo, it provides a helpful GinkgoWriter, which hides output by default and prints it only when a test fails. We should use it to keep our testing output

glsutter commented Dec 8, 2020

Issue Description

Question

Describe what happened (or what feature you want)

Trying to evaluate ChaosBlade as an option for resiliency testing. But I'm not sure if this is a feature request or a question. Actually, two questions:

  • Does ChaosBlade support Azure, or can it be extended to support Azure?
  • Can ChaosBlade inject failures into a Platform as a Service (Pa
good first issue type/feature
litmus
plajjan commented Jul 2, 2021

It seems to me that UTC is used for the on-the-wire representation of time as well as in the database (jaegertracing/jaeger#712), which sort of makes sense, at least with a somewhat naive handling of timezones. However, I think that the Jaeger UI should support displaying times in the timezone local to the user, i.e. that of the browser, so as to reduce the mental load when viewing
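The general idea in this request, keeping UTC in storage and on the wire and converting only at display time, can be sketched roughly as below. This is an illustrative Python sketch, not Jaeger code: the helper `to_display_time` and the fixed UTC+02:00 viewer offset are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def to_display_time(utc_ts: datetime, viewer_tz: timezone) -> str:
    """Render a stored UTC timestamp in the viewer's timezone (hypothetical helper)."""
    if utc_ts.tzinfo is None:
        # Assumption: naive timestamps coming from storage are already in UTC.
        utc_ts = utc_ts.replace(tzinfo=timezone.utc)
    # Only the presentation changes; the stored instant stays the same.
    return utc_ts.astimezone(viewer_tz).isoformat()


# A timestamp as it might be stored, in UTC.
stored = datetime(2021, 7, 2, 12, 0, 0, tzinfo=timezone.utc)

# Hypothetical viewer whose browser reports an offset of UTC+02:00.
viewer_tz = timezone(timedelta(hours=2))

print(to_display_time(stored, viewer_tz))  # 2021-07-02T14:00:00+02:00
```

Converting at the presentation layer leaves the stored data unambiguous while letting each viewer read times in local terms.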

anishasthana commented Jan 19, 2022

This is a rewrite of #129 to make it easier to parse :-)

Background

Prometheus is the de facto standard for monitoring applications in the cloud native space. One of the core concepts here is the idea of "time-series" data (look at the Prometheus docs to get a better idea) for metrics. At a high level, you can just think of it as a continuous series of values for som
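As a rough mental model of time-series data, each series can be pictured as a list of (timestamp, value) samples identified by a metric name plus a set of labels. The Python sketch below illustrates only that idea; the `SeriesStore` class and the metric and label names are hypothetical, and this is not how Prometheus stores data internally.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A series is identified by its metric name plus its sorted label pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]
# A sample is a (timestamp in seconds, value) pair.
Sample = Tuple[float, float]


class SeriesStore:
    """Toy in-memory store: one list of samples per (name, labels) series."""

    def __init__(self) -> None:
        self._series: Dict[SeriesKey, List[Sample]] = defaultdict(list)

    @staticmethod
    def _key(name: str, labels: Dict[str, str]) -> SeriesKey:
        # Sorting makes the key independent of label insertion order.
        return (name, tuple(sorted(labels.items())))

    def append(self, name: str, labels: Dict[str, str],
               timestamp: float, value: float) -> None:
        self._series[self._key(name, labels)].append((timestamp, value))

    def samples(self, name: str, labels: Dict[str, str]) -> List[Sample]:
        # Return a copy so callers cannot mutate the stored series.
        return list(self._series.get(self._key(name, labels), []))


store = SeriesStore()
# Hypothetical counter observed at two points in time.
store.append("http_requests_total", {"method": "GET"}, 1000.0, 1.0)
store.append("http_requests_total", {"method": "GET"}, 1015.0, 4.0)
# Same metric name, different label value: a separate series.
store.append("http_requests_total", {"method": "POST"}, 1000.0, 2.0)

print(store.samples("http_requests_total", {"method": "GET"}))
```

The point of the sketch is that changing any label value yields a distinct series, while repeated observations of the same name and labels extend one series over time.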

good first issue

🔖 Daily-updated reading list for designing High Scalability 🍒, High Availability 🔥, High Stability 🗻 back-end systems - Pull requests are greatly welcome 👬 I hope you will find this project helpful 🍀 Please help me share it to more and more people ❤️ Thank you - 谢谢 - धन्यवाद - ধন্যবাদ - Спасибо - شكرا - Merci - Gracias - Danke - Cảm ơn! 🙇

  • Updated May 17, 2018

Related Topics

sre testing