Fewer unit tests, more logs

As you can imagine, fast software delivery involves cutting corners, or should I say, “corner cases”. You put yourself and your software team at risk of being attacked by hordes of unexpected errors coming from both external and internal users; think: unhandled form data, duplicate orders, records removed by mistake, a super important request from a big fish client, and the list continues. All of a sudden, the manual work you’ve taken away from the other departments ends up on your plate. Now, how to fix it?

Some software conservatives would say: “cover your crappy code with a thick safety net of unit tests and you'll get rid of unimplemented corner cases”. Easier said than done? Well, our standpoint is that it makes little sense to add unit tests at this stage. First, the code changes so frequently that the tests would become obsolete within a week, sometimes within a day. Second, test-driven software development requires discipline and process, and both take time. And there’s no time in an early stage business like Manufaktura.

So, let’s see what measures you can take to reduce error handling work in your team and, more importantly, improve the quality of your service. We will describe the low-hanging fruit first and, in the last section, point you in the right direction when it comes to heavyweight monitoring tools.

Logs

Unified structure

Unified logging structure is invaluable for error handling. For a business app like Manufaktura I dare say it saves you (as a developer) more time than any collection of tests no matter if they are unit or end-to-end. But it’s also beneficial for the security and devops team later on when your business matures.

There are some good resources on logging so give them a thorough read. On top of that, we’d like to list the most important rules based on our experience:

Every dev uses the same, unified log structure
Every log has a unique context identifier (you can use a timestamp, business component, user ID, order ID, web session ID etc.)
Every dev pays special attention to detailed error logging
Every dev starts measuring the execution time of the most important / popular features; it might be useful when the scale hits
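To make these rules concrete, here is a minimal Node.js sketch of a shared log-line builder and a timing wrapper. The names `formatLog` and `timed`, as well as the `Key: value` layout, are assumptions made for illustration; they are not part of any particular logging library, and your team may well settle on a different shape.

```javascript
const util = require("util");

// Hypothetical helper: builds one log line in a fixed shape so that every
// developer on the team emits the same structure.
function formatLog(component, message, context = {}) {
  // Render the context as "Key: value" pairs in insertion order.
  const pairs = Object.entries(context)
    .map(([key, value]) => util.format("%s: %s", key, value))
    .join(" ");
  const line = util.format("[%s] %s", component, message);
  return pairs ? `${line} - ${pairs}` : line;
}

// Hypothetical helper: runs a synchronous function and returns its result
// together with a log line that carries the elapsed time in milliseconds.
function timed(component, label, fn, context = {}) {
  const start = process.hrtime.bigint();
  const result = fn();
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  const line = formatLog(component, label, {
    ...context,
    DurationMs: elapsedMs.toFixed(1),
  });
  return { result, line };
}

const line = formatLog("OrderService", "Order placed", {
  Order: "ord_1",
  Account: "acc_7",
});
console.log(line);
// [OrderService] Order placed - Order: ord_1 Account: acc_7
```

Keeping the component name and the context identifiers in the same position on every line is what later makes the logs easy to search and to build alerts on.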

Detailed error logging example:

logger.error("[EmailService] Email not sent - Order: %s Account: %s Message: %s Error: %j Stack: %j", order.Id, account.Id, error && error.message, error, error && error.stack);

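One caveat worth knowing, assuming your logger formats arguments the way Node's `util.format` does: `%j` runs `JSON.stringify`, and the `message` and `stack` of an `Error` are non-enumerable, so a bare error object is printed as `{}`. The sketch below shows the effect and one possible workaround; `serializeError` is a hypothetical helper, not part of any logging library.

```javascript
const util = require("util");

// Hypothetical helper: copies the fields worth logging into a plain object,
// because JSON.stringify skips the non-enumerable `message` and `stack`.
function serializeError(error) {
  if (!(error instanceof Error)) return error;
  return {
    name: error.name,
    message: error.message,
    stack: error.stack,
    ...error, // keeps custom enumerable properties such as `code`
  };
}

const error = new Error("SMTP timeout");

console.log(util.format("Error: %j", error));
// Error: {}

// The serialized version carries the name, message and stack.
console.log(util.format("Error: %j", serializeError(error)));
```

Passing `serializeError(error)` instead of `error` to the `%j` placeholder keeps the full error context in a single log line.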

Log aggregators

Now, let’s see why these rules are so important. Logging makes little sense if there isn't an easy way to analyze the logs. One way forward is to dump your logs into files on your server, log in via SSH and browse them with your favorite bash spells. But this can be time-consuming and the analysis itself might not be intuitive. Also, you have to manage the storage problem yourself.

Another approach is to set up a full-fledged log aggregator built on top of one of the popular open source tools; ELK and Graylog are of particular relevance here. But again, the setup and hosting take time, especially if you don’t have a Linux wizard on board. Additionally, alerting doesn't come out of the box - you have to configure and maintain another package.

And it is the alerting which is the key takeaway I want you to leave with. Being able to define and be notified about critical error messages is the backbone of a sustainable monitoring system for an early stage system.

Imagine the time saved if you can select which errors are actually important to you and if you can get the error context in a well-formatted email or a dedicated Slack channel.

So, how can we get email/Slack/SMS alerts fast? The answer is: we’ll employ a SaaS platform (you might have already noticed this pattern in our series).

The log aggregator market has matured over the last couple of years. You can choose one of many tools. However, we found LogEntries to be the best for most of our projects. We’ve been using it for over 3 years now, and it’s been totally worth the price. Why LogEntries? Because apart from the alerting module, it also offers several other time-saving features like:

SQL-Like query language for searching.
Aggregated live tail search.
Custom tags for logs.
Works with multiple PaaS (Heroku add-on) and IaaS providers.
Has the ability to aggregate logs from different applications/services.

Let’s get back to the alerts though. Creating email notifications is super simple with LogEntries. You just define a tag using built-in filters or regex and then define which tag should send an email and who gets it. It’s worth noting that you can also adjust the frequency.
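The idea behind tag-based alerts can be sketched in a few lines of JavaScript. This is only a conceptual illustration of "pattern, recipients, frequency"; it is not the LogEntries API or its configuration format, and the rule names, patterns and addresses are made up.

```javascript
// Hypothetical tag rules: a name, a pattern, recipients, and a minimum gap
// between two alerts for the same tag (the "frequency" setting).
const rules = [
  {
    tag: "email-failure",
    pattern: /\[EmailService\] Email not sent/,
    recipients: ["cto@example.com"],
    minGapMs: 60000,
  },
  {
    tag: "server-error",
    pattern: /\bstatus=5\d\d\b/,
    recipients: ["dev@example.com"],
    minGapMs: 0,
  },
];

// Returns the alerts a log line should trigger at time `now` (in ms),
// skipping tags that already fired within their minimum gap.
function alertsFor(line, now, lastFired = new Map()) {
  const alerts = [];
  for (const rule of rules) {
    if (!rule.pattern.test(line)) continue;
    const last = lastFired.get(rule.tag);
    if (last !== undefined && now - last < rule.minGapMs) continue;
    lastFired.set(rule.tag, now);
    alerts.push({ tag: rule.tag, recipients: rule.recipients, line });
  }
  return alerts;
}

const fired = new Map();
const failure = "[EmailService] Email not sent - Order: 1 Account: 7";
console.log(alertsFor(failure, 0, fired).length);    // 1 (first occurrence)
console.log(alertsFor(failure, 1000, fired).length); // 0 (throttled)
```

This is also why the unified log structure pays off: a stable prefix such as `[EmailService] Email not sent` is trivial to match, while free-form messages are not.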

Adding an alert for better handling security

In an online marketplace business like Manufaktura, the CTO will be notified about expected and unexpected errors:

Expected – sometimes you know a particular case isn’t implemented yet (because of priorities*) but it happens so rarely that a simple, manual db update is sufficient. You just need to be informed early enough.
Unexpected – all other stuff like when the server responds with 500 or with a timeout.
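One way to keep the two kinds apart in code is a dedicated error type for the known-but-unimplemented cases, so that the log line carries a tag an alert can match. The sketch below assumes a hypothetical `ExpectedCaseError` class, a made-up refund scenario, and `EXPECTED` / `UNEXPECTED` tag names; adapt them to your own conventions.

```javascript
// Hypothetical error type for cases we know about but have not implemented
// yet; the message should tell whoever gets the alert what to fix by hand.
class ExpectedCaseError extends Error {
  constructor(message, context = {}) {
    super(message);
    this.name = "ExpectedCaseError";
    this.context = context;
  }
}

// Maps an error to a log tag that an alert rule can match on.
function tagFor(error) {
  return error instanceof ExpectedCaseError ? "EXPECTED" : "UNEXPECTED";
}

// Made-up business function with one deliberately unimplemented corner case.
function refundOrder(order) {
  if (order.paidWith === "voucher") {
    // Rare case, handled with a manual db update for now.
    throw new ExpectedCaseError("Voucher refund not implemented", {
      Order: order.id,
    });
  }
  return { refunded: order.id };
}

try {
  refundOrder({ id: "ord_1", paidWith: "voucher" });
} catch (error) {
  console.log("[%s] [RefundService] %s", tagFor(error), error.message);
  // [EXPECTED] [RefundService] Voucher refund not implemented
}
```

Expected errors can then go to a low-frequency digest, while unexpected ones trigger an immediate notification.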

*As we’ve mentioned in previous posts, prioritizing which cases should be implemented first or which features should be shipped at all is a skill in itself. The only way to learn it is the hard way - through experience. The good news is that with the error alerts you just got a handy tool to reduce the impact of the wrong choice.

Plug’n’Play monitoring tools

Imagine you can get a complex application performance dashboard, including metrics like:

Total number of requests
Transaction execution time
Database query execution time
Latency around the world

All of this takes only a couple of lines of code. This is possible with the plethora of app performance SaaS tools. One of the most popular and most mature is New Relic. It supports many programming languages and has grown hundreds of integrations, including database-, browser-, infrastructure- and mobile-specific plugins. But it’s pricey at the same time. That’s why it’s good to take a look at alternatives.

Anyway, at the early stage you don’t need most of the New Relic features. You can get by with just the APM module. And if you host your platform on Heroku, New Relic has an interesting $49 offer for you.

What’s also nice about New Relic is the alerting module. As with LogEntries alerts, you can subscribe to the "expected unexpected" situations, like spikes in traffic which your application cannot handle yet. This gives you a way to react before the shit really hits the fan, e.g. you can scale up your infrastructure for the increased traffic period or try to queue jobs and process them later.

Alerts management

PagerDuty

The basic LogEntries and New Relic notifications run on email. For both tools, you can also add a Slack channel through webhooks. Unfortunately, these aren’t much use when you sleep. And your platform might have one or two super-critical business processes you don’t want to be down even for a minute.

PagerDuty handles this. It gives you SMS/push notification alerts starting at $9 a month. The price goes up if you want to add phone call alerts or if you want to apply on-call scheduling rules for your team. For example, you can make PagerDuty call Tom first; if he doesn’t pick up (that is, acknowledge to PagerDuty that he reacted), it calls Dick, and finally Harry.
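The Tom, Dick and Harry rule boils down to a simple loop. The sketch below only illustrates the escalation logic; it does not use the PagerDuty API, and `escalate` and the `notify` callback are hypothetical names introduced for this example.

```javascript
// Hypothetical escalation: notify each contact in order and stop at the first
// one who acknowledges. `notify` is an assumed callback that resolves to true
// when the contact acknowledged before a timeout.
async function escalate(contacts, notify) {
  const tried = [];
  for (const contact of contacts) {
    tried.push(contact);
    if (await notify(contact)) {
      return { acknowledgedBy: contact, tried };
    }
  }
  // Nobody reacted; the caller decides what to do next.
  return { acknowledgedBy: null, tried };
}

// Simulated on-call rota: Tom misses the call, Dick picks up.
const answers = { Tom: false, Dick: true, Harry: true };

escalate(["Tom", "Dick", "Harry"], async (name) => answers[name]).then(
  (outcome) => console.log(outcome.acknowledgedBy, outcome.tried)
);
```

In this simulated run Dick acknowledges the alert, so Harry is never called.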

Moreover, when you go to their integrations page, you’ll find both LogEntries and New Relic among more than 200 other connectors. The integrations and the triage support give you a simple way of connecting a particular error to a responsible developer.

Summary

These 3 SaaS monitoring tools are good value-for-money investments in an early stage online business. The configuration and hosting don’t require a dedicated administrator and they all have offers for small teams (LogEntries $39, New Relic $49, PagerDuty $9). What you get is priceless - less manual work for your dev team and a higher quality of service. With such a thick safety net, our Manufaktura is ready to develop more powerful features. In the next article, we’ll tackle one of these - email and SMS communication.

Now, when the number of features grows, the infrastructure swells and so do the bills from our SaaS monitoring providers. That's when you might want to reconsider your monitoring toolset and tap into self-hosted open source products. These are some of the market leaders:

Prometheus – a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
Grafana – allows you to query, visualize, alert on, and understand your metrics no matter where they are stored (integrates nicely with Prometheus).
Consul – a tool for discovering and configuring services in your infrastructure and managing health checks.
ELK – a centralized logging system based on Elasticsearch, Logstash, and Kibana.
Graylog – another log management tool; requires Elasticsearch and MongoDB.

Mike Sedzielewski

Co-founder of Voucherify and rspective. Fan of API-solutions for marketing automation and headless approach in digital marketing and sales. In private, he enjoys reportage and everything non-fiction.
