Presented by

  • Matthew Oliver

    @mattoliverau
    https://oliver.net.au/

    Matthew is a senior systems software engineer at NVIDIA, where he primarily works on upstream OpenStack as a Swift core. Based in Melbourne, Australia, he has been hacking on Swift since 2014. Before NVIDIA, Matthew worked at SUSE on both Swift and Ceph upstream, so he has been working in the object storage space for a while. Matthew was a co-founder of the Kororaa Linux distribution, which has given him careers in both Linux system administration and software development.

Abstract

Making life easier for SREs and Ops is important. So is visualising component interactions inside a cluster, to see where improvements can be made and to help drive development focus. And let's face it, seeing graphs and visual traces is fun :)

Swift can log very verbosely, but in production, especially on very large clusters, you don't want to turn your logging up too far. When a customer is having an issue, sometimes all you can do is come up with a hypothesis from the logs and then test it in staging or dev environments. But what if you could tag a request and follow it through the cluster? Better, what if that tracing was integrated into the software itself, so we can break down not only the inter-node requests but also delve into what's happening on the node itself?

Well, that's exactly what I've been playing with. What started out as middleware benchmarking, and sharing initial results with our SREs, has snowballed into request tracing... and to be honest, it's pretty fun. Now we can:

- See where a request spends its time.
- Start getting a visual understanding of what different requests look like in the cluster.
- Use the information to better tune the configuration and topology of the cluster.
- Find areas where we need to put more developer time to optimise different code paths.
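
To give a rough feel for what tagging and timing a request inside middleware can look like, here is a minimal sketch of a generic WSGI wrapper. It is not the tracing code from the talk and not part of Swift: the class name `TraceMiddleware`, the `X-Trace-Id` header, and the in-memory `spans` list are all assumptions made up for illustration. A real deployment would more likely ship spans to a tracing backend than append them to a list.

```python
import time
import uuid


class TraceMiddleware:
    """Hypothetical WSGI middleware: tag a request with a trace ID and time it."""

    # Assumed header name (X-Trace-Id); this is not an existing Swift header.
    TRACE_KEY = "HTTP_X_TRACE_ID"

    def __init__(self, app, name, sink):
        self.app = app      # the next WSGI app in the pipeline
        self.name = name    # label for this layer's span
        self.sink = sink    # list that collects finished spans

    def __call__(self, environ, start_response):
        # Reuse an existing trace ID so spans from every layer line up.
        trace_id = environ.setdefault(self.TRACE_KEY, uuid.uuid4().hex)
        start = time.monotonic()
        try:
            return self.app(environ, start_response)
        finally:
            # Only the call into the wrapped app is timed here, not the
            # time spent iterating over the response body afterwards.
            self.sink.append({
                "trace_id": trace_id,
                "name": self.name,
                "duration": time.monotonic() - start,
            })


def demo_app(environ, start_response):
    """Stand-in for the real application at the end of the pipeline."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# Two nested layers share one trace ID, giving a tiny two-span trace.
spans = []
pipeline = TraceMiddleware(TraceMiddleware(demo_app, "inner", spans), "outer", spans)
body = pipeline(
    {"REQUEST_METHOD": "GET", "PATH_INFO": "/v1/a/c/o"},
    lambda status, headers: None,
)
```

Because each layer records its own span under the shared trace ID, comparing the outer and inner durations shows how much time a given layer adds, which is the kind of breakdown the list above describes.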