Hi everyone, getting through a lot of firsts this week:

  • First time in Portland
  • First time at Monitorama
  • First time speaking at Monitorama
  • First time speaking at a conference
  • First time I was solo on stage since my high school talent show (a bass solo featuring Jimi Hendrix’s “Purple Haze”)
  • First blog on Medium
  • First time publicly sharing details of things I’ve worked on

That last bullet point is something I have really wanted to change. The reason I haven’t so far is partly that my career has been made up of projects I couldn’t publicly talk about, and partly that I wasn’t sure I had much to contribute to the conversation. If I’m going to say something, I’d rather there be a high signal-to-noise ratio. Given the quality of speakers and content at Monitorama this year, it was an honor and a joy to be able to get out there and share some of my/our experiences.

As a little background, I wasn’t one of the initially chosen speakers. I had already purchased tickets and was really looking forward to attending. Jason reached out to me a little over a month before the event because a speaker slot had opened up and my CFP was highly ranked. I jumped at the opportunity, but also knew that I needed to get to work pulling a talk together.

I had been meaning to set up an internal class on JVM profiling for a while, so I gathered content on some of the issues we had seen within the Observability Team. I highlighted how to use our stack to find where problems were and how drilling deeper with profiling tools allowed us to resolve those issues. I went a bit deeper on what each feature of the tools allowed you to do and how it made the discovery process a little simpler. I also wanted to highlight some of the issues that I’ve run into in my ~10 years of doing this kind of work. That was the basis for my Monitorama talk, and I tried to distill things to the more interesting pieces for public consumption. I hope to work with my team to write up a tech blog where we can go a little more in depth in those areas. My teammates were incredibly supportive and really helped improve the quality of my presentation; I couldn’t have done this without them. This post will focus more on my thoughts and experiences, less on the technical side.

The Conference

I’m not sure I could have enjoyed the talks more. It was also great getting to connect with the other speakers and attendees. The speakers’ dinner and the treatment by the organizers are worth the nerves of submitting a CFP. It was really cool being able to catch up with people, talk openly about some of the issues we are facing, and consider where things are going.

I left feedback for all of the speakers, and others have left some great notes on highlights. I could spend all day listing what I pulled from each talk (every speaker had something that should have sparked some thought), so I’ll stick to some high-level things.

  • People are important, be good to them and be good to each other
  • Context and discoverability are important and there is a massive gap in our existing tools
  • User experience is paramount to success
  • Big players invest in infrastructure and Observability, but all have a different take on a solution

If you weren’t in attendance, you might not be aware that the second day of the conference was majorly impacted by a power outage. There was an underground electrical fire that took power out for a day to multiple blocks surrounding the venue (including my hotel room). This was following Alice Goldfuss’s talk, which referenced Centralia, PA (a town that contains an underground coal mine fire that has been burning since 1962). The event organizers pulled some miracles and everything continued on schedule, despite a venue swap on Tuesday.


Prior to speaking, I felt confident enough in myself, as I had done a few practice sessions in my hotel room. Unfortunately, I also had less than five hours of sleep pretty much every night for the week leading up to my talk. There were a few things I intended to touch on but missed, along with some mistakes. Once I was on stage, there wasn’t much thinking going on. Pretty much everything out of my mouth was adrenaline or instinct, so I’m glad I left the few speaker notes that I did. I know with experience this will get better, more comfortable. I had the benefit of being in a “lower pressure” spot, given that I was the first speaker after lunch on the last day. I also knew that I could just put everyone into a lull for the rest of the conference, so that put some of the pressure back on lol.

I said “umm” and “uhh” way too many times. When I referenced Julia’s presentation, I could have used a better transition (what I meant was to “build on top of”, but in the moment “complex” was what popped out of my mouth). My font size was a little too small in some slides (I’m working on getting those slides out). I should prefer more bold text and speak to fill in the details, as opposed to filling pages with text.

Fml, I said “you guys” when asking about the poll to Open Source our stack 😣. Unfortunately there’s no edit button on a live performance. No one pointed this out to me, but I noticed while re-watching my talk for critique. This was completely unintentional and something I was consciously trying to avoid prior to the talk. I wanted this to be inclusive and it’s something I have to work out of my vocabulary. I’m certainly not perfect and will continue to work to better myself. I can only hope to have the opportunity to keep learning from these mistakes.

I also meant to touch on the fact that my talk actually branches out from previous Twitter talks at Monitorama.

Caitie’s presentation last year discussed on-call burnout and how to address those problems. At the end, she mentioned that our indexing service was starting to result in more pages. That was the same service I discussed during the final case study.

In my first example, the service that’s diving is part of our Mon 2.0 stack, which Megan, Justin, and Dan talked about last year. The lingering hosts that were falling over less frequently are caused by a lack of balance in how expensive rules/queries are evaluated. This is something Dan discusses in that talk as keeping the solution simple, which is how you SHOULD approach the problem: introduce complexity only as necessary and to solve a known issue.

My hope when sharing my presentation was to give a more realistic view of what issues we encounter. It wasn’t meant to be a sales pitch or to sell a specific tool, as these issues cut across tools and the approach should be generally applicable. It was also meant to show just how much is missing between a dashboard or alert and resolving an issue. These problems take expertise and experience to diagnose (or to prevent from entering your systems in the first place). I work with some extremely intelligent people, and we all still make mistakes. My days aren’t filled with profiling. I spend most of my time architecting solutions, delivering new features, researching new ideas, clearing out tech debt, refactoring cruft, and on code reviews, documentation, on-call, support chat, communications, etc. Profiling and performance analysis is an invaluable tool that everyone should have in their toolbox.

I want to improve time to resolution for all of our consumers, which is difficult to quantify accurately. I would also prefer that others not have to repeat some of the same mistakes or rediscover issues we’ve already resolved.

This leads me to the Open Source poll. I was thrilled with the response. It wouldn’t be worth trying to do if a community wasn’t willing to build around it. We will see where things go from here and I’m looking forward to discussing with everyone who has reached out.

If you missed it, you can watch my talk here:

Thanks for reading and thanks to Monitorama for such a fantastic experience. See everyone at Monitorama 2018? Now back to packing to move and catching up on rest 😎.