The Boring Guide to IT for Small & Medium Businesses

Most writing about IT is aimed at companies that employ twelve people whose entire job is IT. That's useful if you're running a two-thousand-person engineering organisation. It's not useful if you're a founder, a head of operations, or the person at a twenty-person company who happens to be "good with computers."

This guide is for the second group. It's everything you need to know to run the infrastructure of a small business without burning time, budget, or a weekend when something silently breaks.

We'll cover what to monitor, how alerting should actually work, what you can safely automate, how to avoid pricing traps, how to manage vendor dependencies, and how to sequence all of it so you don't spend six months setting up things you don't need yet.

No enterprise frameworks. No five-stage maturity models. Just the boring parts of running IT that actually matter.

Why IT for small businesses is different

When you read about IT operations in most places, the assumptions are invisible but enormous. There's a dedicated security team. A platform team. An on-call rotation. A budget that absorbs a $50k-a-year tool purchase without flinching. A compliance function. Change management. A change advisory board (CAB).

Small businesses have none of that, and shouldn't pretend otherwise.

What you have instead is a small number of systems, a small number of people who touch them, and very little tolerance for things breaking. The usual advice — "build an SRE culture," "adopt SOC 2 processes early," "implement zero-trust networking" — is either wrong for your scale or so expensive to implement properly that it becomes theatre.

The good news is that you don't need any of that to be in good shape. You need to monitor the handful of things that actually break, alert the right person when they do, automate the obvious stuff, and keep a lid on ongoing costs. The list is shorter than the enterprise world would have you believe.

The trap is copying practices from companies a hundred times your size. The opportunity is doing the boring, practical basics better than most of your peers — because most of them aren't doing them either.

What to actually monitor

Start with this question: what would make a customer contact you if it broke?

For almost every small business, the answer is one or more of:

The website being down
Email not arriving (yours to them, or theirs to you)
The domain suddenly not resolving
An SSL certificate expiring and showing a scary warning
A key vendor going down and taking part of your service with it

That's it. That's the list. Everything else is secondary.

The things that most frequently cause problems are also, in our experience, the things least likely to be monitored. Domain auto-renewals quietly fail because the card on file was replaced. SSL certificates issued for ninety days expire because nobody kept the renewal automation working. DNS records get edited by someone long gone and then break months later. Vendor outages take down a critical third-party feature and you only find out when customers complain.

There's a whole separate question of what silently breaks first when nobody's looking: the kind of failure that lingers for weeks without anyone noticing. We wrote about that in "what breaks first when nobody's looking," because the pattern of silent failure deserves its own treatment.

For now, the baseline monitoring list for almost any SMB is:

Uptime of your primary website — a simple HTTP check every minute is fine. Website Uptime Monitor runs this for you across regions.
SSL certificate expiry — check weekly; alert when under 30 days. SSL Certificate Expiry validates the full chain on any domain.
Domain expiry — check daily; alert aggressively starting at 90 days before expiry. Domain Expiry Watcher handles this end-to-end.
DNS records — snapshot them and alert on unexpected changes. DNS Monitoring Tool covers this.
Email authentication (SPF, DKIM, DMARC) — check weekly; alert on breakage. Email Deliverability Checker audits the whole stack in one pass.
Vendor status — subscribe to status pages of anything your business depends on. Is That Down aggregates the public ones in one place.
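
As a rough illustration of the certificate item in the list above, here is a minimal Python sketch using only the standard library. The 30-day threshold comes from the list; the function names are invented for this example and are not part of any product mentioned here.

```python
import socket
import ssl
import time

WARN_DAYS = 30  # alert threshold taken from the list above


def days_until(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter string."""
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the leaf certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def needs_alert(days_left: float, warn_days: int = WARN_DAYS) -> bool:
    """True once the certificate is inside the warning window."""
    return days_left < warn_days


def check_host(host: str) -> tuple[float, bool]:
    """Return (days left, whether to alert) for one host. Needs network access."""
    left = days_until(fetch_not_after(host), time.time())
    return left, needs_alert(left)
```

Run from a weekly scheduled job, `check_host` would return the days remaining and whether to alert. With the default context, an already-expired certificate makes the handshake itself fail with `ssl.SSLCertVerificationError`, which is worth treating as an alert in its own right.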

This is not an exhaustive list. It's the load-bearing list. Get this right, and the odds of a quiet-failure disaster drop dramatically.
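
The DNS item in the baseline list boils down to snapshot-and-compare. A minimal Python sketch of the comparison step follows. How the records are collected is left out, since the standard library only resolves addresses and fetching MX or TXT records would need a resolver library or a tool like `dig`. The snapshot shape and the function name are assumptions made for this example.

```python
# A snapshot maps (record name, record type) to the set of values seen.
Snapshot = dict[tuple[str, str], frozenset[str]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> list[str]:
    """Return one human-readable line per record whose values changed."""
    changes = []
    for key in sorted(old.keys() | new.keys()):
        before = old.get(key, frozenset())
        after = new.get(key, frozenset())
        if before != after:
            name, rtype = key
            changes.append(f"{name} {rtype}: {sorted(before)} -> {sorted(after)}")
    return changes
```

An empty result means nothing changed. Anything else is worth a look by whoever owns the domain, even if the change turns out to be intentional.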

Anything beyond this — application performance monitoring, log aggregation, endpoint monitoring, user behaviour analytics — is useful but optional. Set those up when you have a reason, not because a blog post told you to.

How alerting should actually work

Monitoring without good alerting is just a dashboard nobody looks at.

The hard part of alerting isn't technical. It's figuring out which signals are real, which are noise, who should see them, and what action the alert implies. Most SMB alerting setups fail at the human layer long before they fail at the technical one.

Three principles, learned the hard way:

Alert only on things that require action. If an alert fires and no one does anything differently, delete the alert. It's just training people to ignore the channel it goes to.

Route each alert to the single person who can fix it. Not "the team." Not "engineering." One human, one alert, one obvious next step. This fails when ownership is fuzzy, which is why so much SMB ops feels chaotic.

Make the severity honest. Sending everything to the pager, with no lower tier for anything else, is a fast road to alert fatigue. Use tiers: an "info" channel for things you review in a morning standup, a "warn" channel for things you deal with today, and a genuine pager for things that require response now.
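
One way to keep these three principles honest is to write the alert list down as data and check it. The sketch below is only an illustration in Python: the channel names, the `GROUP_NAMES` set, and the `AlertRule` fields are all hypothetical, chosen to mirror the principles above rather than any particular tool.

```python
from dataclasses import dataclass

# Hypothetical channel names, one per tier described above.
CHANNELS = {"info": "morning-review", "warn": "handle-today", "page": "pager"}

# Owners that usually mean "nobody in particular" (illustrative, extend as needed).
GROUP_NAMES = {"team", "the team", "engineering", "everyone", "ops"}


@dataclass(frozen=True)
class AlertRule:
    name: str
    severity: str  # "info", "warn" or "page"
    owner: str     # one named person
    action: str    # the obvious next step


def validate(rules: list[AlertRule]) -> list[str]:
    """Return a list of problems; an empty list means every rule passes."""
    problems = []
    for rule in rules:
        if rule.severity not in CHANNELS:
            problems.append(f"{rule.name}: unknown severity {rule.severity!r}")
        if not rule.owner.strip() or rule.owner.strip().lower() in GROUP_NAMES:
            problems.append(f"{rule.name}: owner must be a single named person")
        if not rule.action.strip():
            problems.append(f"{rule.name}: no action defined, consider deleting it")
    return problems


def route(rule: AlertRule) -> tuple[str, str]:
    """Return (channel, owner) for a rule."""
    return CHANNELS[rule.severity], rule.owner
```

A rule that fails `validate` is a candidate for deletion or for an ownership conversation, which is usually the harder part.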

We've written elsewhere about the 3 AM alert problem (what happens when the noise from bad alerting trains your team to sleep through the real ones) and about how alert fatigue quietly becomes a risk rather than just an annoyance. Both pieces expand on this section; both are worth reading if you've ever been woken up by a disk-usage warning.

The short version: fewer, better alerts. Delete ruthlessly. Tune based on what actually happens during incidents.

What to automate, and what to leave alone

Small businesses have a particular flavour of automation mistake: they automate the wrong things.

What gets automated is usually the thing the founder or ops person finds annoying right now — a deploy script, a report, a bit of data entry. What should get automated is the thing that silently kills you if forgotten — SSL renewal, domain renewal, payment-method updates at your registrar, backup verification.

The principle is: automate the boring-but-critical, not the interesting-but-optional.

Good candidates for automation include:

SSL renewal (Let's Encrypt + a renewal cron, or a managed provider that does it automatically)
Domain auto-renewal, with a monitoring check that verifies it actually happened
Database backup and backup verification (an unverified backup is not a backup)
Certificate and secret rotation, where practical
Routine security patches on infrastructure you manage

Things that look automatable but are often better left manual at small scale:

Customer-data migrations
Anything involving payments, refunds, or money movement
Cross-team process handoffs where the "process" is mostly tribal knowledge

The set-and-forget philosophy isn't about being lazy — it's about making the system robust enough that forgetting is safe. If forgetting a task for a week would break something, either automate it or don't forget it. Don't do the middle option, which is "hope."
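
Backup verification is the item on the automation list most often left at "the job ran." As one hedged illustration, the Python sketch below opens a backup file and checks that it is readable and holds a plausible number of rows. It assumes a SQLite backup purely to stay self-contained; the table name and row threshold are placeholders, and a Postgres or MySQL setup would instead restore into a scratch database and run equivalent checks.

```python
import sqlite3


def verify_backup(path: str, table: str, min_rows: int) -> list[str]:
    """Open a SQLite backup read-only and return any problems found.

    `table` is assumed to come from trusted configuration, not user input.
    """
    try:
        conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    except sqlite3.Error as exc:
        return [f"could not open backup: {exc}"]

    problems = []
    try:
        status = conn.execute("PRAGMA integrity_check").fetchone()[0]
        if status != "ok":
            problems.append(f"integrity_check returned {status!r}")
        rows = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        if rows < min_rows:
            problems.append(f"{table} has {rows} rows, expected at least {min_rows}")
    except sqlite3.DatabaseError as exc:
        problems.append(f"could not read backup: {exc}")
    finally:
        conn.close()
    return problems
```

The point of the row threshold is to catch the backup that technically succeeds but captured an empty or truncated database.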

Pricing traps to avoid

Small businesses have a specific vulnerability in how infrastructure is priced: they're sold tools priced for enterprises.

The two most common traps:

Per-seat pricing on tools you'll outgrow. You start with five seats. You end up with twenty-five. Suddenly your monitoring tool is $500/month. This is usually fine for a tool you love. It's painful for a tool you tolerate.

Per-asset pricing on monitoring tools. You start with three domains. Your portfolio grows to fifteen. Your alerting tool is now charging you per domain, per check, per alert destination, and the bill has quintupled. This one catches nearly every growing SMB, which is exactly why per-asset pricing is a scam in practice, even when it isn't obviously predatory.

Defences:

Prefer flat-rate pricing for tools you expect to grow into.
For tools that charge per asset, model out the cost at 3x your current scale before signing.
Audit your recurring spend quarterly. Something that was $29/month when you signed up five years ago is rarely still $29/month today.
Check whether your current tool's "enterprise" tier is actually needed, or whether you're paying for a feature you use once a year.
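
The "model it at 3x" defence is simple arithmetic, sketched here in Python. The prices in the usage note are invented for illustration and do not describe any real vendor; the function names are likewise just for this example.

```python
def per_asset_bill(assets: int, price_per_asset: float, base_fee: float = 0.0) -> float:
    """Monthly bill for a tool that charges a base fee plus a price per asset."""
    return base_fee + assets * price_per_asset


def project(assets_now: int, price_per_asset: float, flat_rate: float,
            scale: int = 3) -> dict[str, float]:
    """Compare per-asset pricing today and at `scale` times today's asset
    count against a flat-rate alternative."""
    at_scale = per_asset_bill(assets_now * scale, price_per_asset)
    return {
        "per_asset_now": per_asset_bill(assets_now, price_per_asset),
        "per_asset_at_scale": at_scale,
        "flat_rate": flat_rate,
        "monthly_difference_at_scale": at_scale - flat_rate,
    }
```

With made-up numbers (three domains at $10 each versus a $29 flat rate), `project(3, 10.0, 29.0)` shows the per-asset bill going from $30 to $90 at 3x scale while the flat rate stays at $29.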

A related principle: the tools that look cheapest are often the ones with the most pricing surface area. Read the bill, not the landing page.

Vendor dependency management

Your infrastructure is made of other people's infrastructure. That's fine until it isn't.

Every SMB quietly depends on a web of third parties: hosting provider, DNS provider, email sender, payment processor, authentication provider, CRM, analytics, CDN. Most of them are fine most of the time. The problem is that when one of them isn't fine, you find out because your customers are complaining — not because you knew it was happening.

The fix is not complicated:

List your dependencies. A single document naming every vendor whose outage would affect your customers. You'll probably be surprised how long the list is.
Subscribe to their status pages. Almost every serious vendor maintains one. Route those alerts somewhere you'll actually see them, not your personal inbox where they die.
Know the fallback for each one. For some, there isn't a graceful fallback, which is important to know. For others, you can switch providers in minutes if you've planned.
Check in on critical vendors quarterly. Are they healthy? Still investing in the product? Not obviously about to be acquired and enshittified?
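
The dependency list does not need tooling, but keeping it as structured data makes the gaps easy to spot. A minimal Python sketch, with hypothetical field names and a `NO_FALLBACK` marker invented here to record "we checked, and there isn't one":

```python
from dataclasses import dataclass
from typing import Optional

# Explicit marker for "we checked, and there is no graceful fallback".
NO_FALLBACK = "no graceful fallback"


@dataclass
class Dependency:
    vendor: str
    purpose: str
    status_page_subscribed: bool
    fallback: Optional[str] = None  # a plan, NO_FALLBACK, or None if undecided


def open_questions(deps: list[Dependency]) -> list[str]:
    """List the vendors that still need a status subscription or a fallback decision."""
    issues = []
    for dep in deps:
        if not dep.status_page_subscribed:
            issues.append(f"{dep.vendor}: not subscribed to status page")
        if dep.fallback is None:
            issues.append(f"{dep.vendor}: fallback not decided")
    return issues
```

Recording "no graceful fallback" as a deliberate answer, rather than leaving the field empty, is what separates a known risk from an unexamined one.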

This isn't vendor management in the procurement sense. It's just knowing where your dependencies are so you aren't surprised. We go deeper into this in "vendor risk is your risk," but for the purposes of this guide: write the list.

DIY versus buy

A lot of SMB IT spending is either too DIY (rebuilding what could be bought for $40/month) or over-bought (enterprise suites priced like a small car).

Good rules of thumb:

Buy when: the problem is boring, well-solved, and not central to your business. Monitoring, DNS, email sending, backups, password management. The existing solutions are good, the pricing is reasonable, and your time is worth more than the saved subscription fee.

Build when: the problem is specific to your business, your team already has the capability, and the ongoing maintenance cost is bounded. A custom admin tool, a simple data pipeline, an internal script that codifies some domain-specific logic.

Neither when: the problem is real but you don't actually need to solve it right now. This category is underrated. Lots of "we need X" is really "we might need X in a year, and we can revisit then."

The trap most small businesses fall into is the opposite of what you'd expect. They don't over-build — they over-buy. They sign up for enterprise platforms priced to be used at scale, use 10% of the features, and pay the full bill anyway. The case for monitoring on the cheap (without going full DIY) is stronger than most vendors would like you to believe, and it generalises beyond monitoring.

The minimal SMB monitoring stack

Site Watcher covers domains, SSL, uptime, DNS, and vendor status in one flat-rate subscription. No per-asset pricing. No enterprise upsell.


Monitoring and your team

A monitoring setup is only as durable as the people who remember why it exists.

This sounds abstract. It becomes very concrete the first time someone leaves the company and takes with them the mental model of why a particular alert fires, who owns it, and what the expected response is.

Three failure modes to guard against:

The single-founder trap. Everything lives in one head. When that person is on holiday, nothing gets fixed, because nobody else can read the signals. If you're that person, the exercise is: if you were unreachable for two weeks, what would break, and who would know? The single-founder ops stack is a real thing, but it has to be documented enough that someone else can run it in an emergency.

The handoff gap. Someone leaves. Their alerts keep firing into a channel nobody watches. Or worse, into an inbox that bounces because their email was deactivated. Whenever someone moves teams or leaves, there's a monitoring handoff that needs to happen. Most SMBs don't have one, which is why we wrote about the monitoring handoff problem as a standalone piece.

The tribal knowledge tax. Every alert has context that isn't written down: "oh, that one fires every Tuesday, it's fine" or "that means the database is about to run out of disk." New people can't tell the benign weirdness from the real signal. Writing down the context — a simple runbook per alert, even two lines — pays for itself the first time someone new is on-call.

The common theme: monitoring is a team artefact, not an individual one. If it only works because one specific person is there, it doesn't really work.
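
Two of these failure modes can be checked mechanically if alert ownership is written down. The Python sketch below assumes a simple mapping from alert name to owner and a set of current staff; both structures and the function names are hypothetical.

```python
from collections import Counter


def orphaned_alerts(owners: dict[str, str], active_staff: set[str]) -> list[str]:
    """Alerts whose owner is no longer on the staff list (the handoff gap)."""
    return sorted(alert for alert, owner in owners.items() if owner not in active_staff)


def owner_share(owners: dict[str, str]) -> dict[str, float]:
    """Fraction of all alerts owned by each person (the single-founder trap)."""
    total = len(owners)
    if total == 0:
        return {}
    counts = Counter(owners.values())
    return {person: count / total for person, count in counts.items()}
```

Running `orphaned_alerts` whenever someone leaves or changes role is a cheap version of the monitoring handoff. A single name holding nearly every alert in `owner_share` is the signal to document and share.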

The real cost of downtime

SMBs routinely underestimate how expensive downtime is. Not because they're bad at math — because the real costs are diffuse.

When you do the simple math — hourly revenue × duration — you get a number. It's usually not scary enough to motivate investment. "We do $X/hour, we were down for an hour, that's $X." Easy, tolerable.

The real cost is almost never just the direct revenue loss. It's:

Customers who churn silently — they didn't complain, they just never came back.
Customers who bring it up in the next sales call — "we heard you had an outage last month, is that normal?"
Support cost — the tickets generated by the outage, which can run for days after.
Engineering cost — the post-mortem, the hotfix, the scar tissue work.
Trust compounding — each outage makes the next one more expensive, because the story becomes "again."

For a consumer business, the real cost of a 1-hour outage is often 3-5x the direct revenue number. For a B2B SaaS with enterprise customers, it can be larger still, because a single contract can walk over cumulative reliability concerns. We lay the math out in "the real cost of a 1-hour outage"; the short version is that the direct-revenue number is a lower bound, not a reasonable estimate.
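
The arithmetic above fits in a few lines of Python. The 3x and 5x multipliers are the consumer-business range quoted in this section; the revenue figure in the usage note is invented for illustration.

```python
def direct_loss(hourly_revenue: float, hours: float) -> float:
    """The simple number: hourly revenue times outage duration."""
    return hourly_revenue * hours


def estimated_real_cost(direct: float, low: float = 3.0,
                        high: float = 5.0) -> tuple[float, float]:
    """Range for the real cost, using the 3-5x rule of thumb from the text."""
    return direct * low, direct * high
```

For a hypothetical business doing $2,000 an hour, a one-hour outage is $2,000 in direct revenue and roughly $6,000 to $10,000 once the diffuse costs are counted.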

This matters when you decide what to invest in. The return on monitoring-and-alerting investment isn't "we'll catch outages faster." It's "we'll avoid outages that would have compounded into customer attrition."

Compliance without overkill

At some point, small businesses start hearing "SOC 2" and "ISO 27001" and "compliance" and assume they need a six-figure programme. They usually don't, at least not at the stage they first hear about it.

Compliance is genuinely useful when:

Your enterprise customers are asking for it in contracts.
You handle regulated data (health, financial, children) that comes with actual legal requirements.
You're fundraising and the diligence process includes it.

Compliance is security theatre when:

You're adopting a framework because it sounds serious, not because anyone is asking for it.
You're implementing controls you don't understand for problems you don't have.
Your compliance spend is a meaningful percentage of your revenue and you aren't in a regulated industry.

The practical posture for most small businesses is: build good operational hygiene now, and when compliance becomes a genuine requirement, most of the controls will already be in place. The compliance versus common-sense monitoring piece goes deeper, but the summary is that compliance without hygiene is performance, and hygiene without compliance still protects the business.

Practical controls that matter regardless of framework:

Audit logs on systems that handle customer or financial data.
Backup verification on a schedule you can point to.
Access reviews at least quarterly.
A written incident response process, even a one-pager.
Monitoring evidence — alerts, response times, resolution notes — in some form you could hand to an auditor if you had to.

If you do all of that without ever buying a compliance platform, you're better positioned than most of your peers. If you later go through a formal audit, you'll bring receipts.

IT during due diligence

If you're fundraising, getting acquired, or running serious enterprise sales, someone is going to ask about your infrastructure. It's worth thinking about this before it happens.

What buyers and serious customers care about, in roughly declining order:

Data handling. Where does customer data live? Who has access? How is it protected?
Uptime and reliability. Not just the last 90 days — the story you tell about how reliability works.
Vendor risk. Who do you depend on? What's the blast radius if they go down?
Security posture. Even without a formal framework, can you articulate what you do and don't protect against?
Operational maturity. Can you respond to incidents? Do you have logs? Can you answer "what happened on [specific date]?"

The good news is that almost all of this is answered well by the boring basics in this guide. Decent monitoring with history. A vendor list. A runbook culture, even a thin one. Backup verification. A changelog of infrastructure changes.

What breaks due diligence isn't usually a missing framework — it's the inability to answer basic questions. "How do you know your customers' data is safe?" is not a framework question. It's a "walk me through the basics" question, and if you can't, it looks bad regardless of how many certifications you hold.

The full due diligence monitoring checklist is its own article; the short version is: if you're running the basics well, you'll look more mature than companies three times your size who bought the framework and skipped the hygiene.

Sequencing: what to set up first

A surprising amount of SMB ops dysfunction comes from doing the right things in the wrong order.

Here's an opinionated sequence, aimed at a company that is already running but hasn't invested in ops hygiene yet:

1. Domain and DNS monitoring. You are one expired card away from your entire business being offline. Start here.
2. SSL monitoring. Close second. An expired cert takes you down without warning.
3. Uptime monitoring on the primary site. Basic HTTP check, one-minute interval, alerts to someone who will see them.
4. Email authentication checks. If you send transactional or marketing email, SPF/DKIM/DMARC monitoring prevents silent deliverability cliffs.
5. Vendor status page subscriptions. Aggregate them somewhere you'll actually look.
6. A short runbook per alert. Even two lines. Explains what the alert means and what to do.
7. Backup verification. Actually restore a backup on a schedule you can point to.
8. Access review. Who has admin on what. Prune aggressively.
9. Quarterly cost audit. Every subscription you have, whether you still use it, what it costs.
10. A dependency diagram. Doesn't have to be fancy. Has to exist.

This sequence inverts what a lot of SMBs do: they tend to start with whatever's currently in crisis and never get past step three. The trap is treating the crisis du jour as the next priority, rather than getting the base covered.

If you haven't done any of this, and you feel the urge to do all of it at once, don't. Do steps 1-3 this week. Do 4-5 next week. Spread the rest over a month. The goal is steady-state hygiene, not a project.

A related cultural point: most small businesses set up monitoring last, somewhere between "we got big enough" and "we got burned enough." The right time to set it up is before either of those. It's cheap, it's fast, and it compounds.
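
To make the uptime step in the sequence above concrete, here is a minimal sketch of a single HTTP check in Python using only the standard library. The function name, the timeout, and the injectable `opener` parameter (there so the check can be exercised without a network) are choices made for this example. Scheduling it every minute, for instance from cron, and delivering the alert are left out.

```python
import urllib.error
import urllib.request


def check_once(url: str, timeout: float = 10.0,
               opener=urllib.request.urlopen) -> tuple[bool, str]:
    """Fetch `url` once and return (is_up, detail)."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, DNS failures, refused connections and timeouts.
        return False, f"unreachable: {exc}"
    return 200 <= status < 400, f"HTTP {status}"
```

The returned detail string is meant to go straight into the alert, so whoever receives it can tell a 503 from a DNS failure without opening a dashboard.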

A simple IT ops checklist

If you read this entire guide and want to come away with one page, here it is. Tick the boxes you've actually done. Not "we talked about doing." Done.

Monitoring

[ ] Domain expiry monitoring on every domain you own
[ ] SSL certificate expiry monitoring on every cert
[ ] Uptime monitoring on primary customer-facing site
[ ] DNS change monitoring on your main domain
[ ] Email authentication checks (SPF, DKIM, DMARC)
[ ] Status page subscriptions for critical vendors

Alerting

[ ] Every alert goes to a specific person
[ ] Alerts have tiers (info / warn / page)
[ ] Every alert has a short runbook, even two lines
[ ] Someone has reviewed the alert list in the last 90 days

Automation

[ ] SSL renewal is automated and monitored
[ ] Domain auto-renewal is on, with a check that verifies it happened
[ ] Backups run on a schedule
[ ] Backup restores are tested on a schedule

Vendor hygiene

[ ] Written list of all third-party dependencies
[ ] Each dependency has a known fallback or a known "no graceful fallback" note
[ ] Critical vendors are reviewed at least quarterly

People

[ ] More than one person knows how to read the alerts
[ ] There's a documented handoff process when someone leaves
[ ] Runbooks are in a place people can find them

Cost

[ ] Recurring spend is audited quarterly
[ ] No tool is on per-asset pricing you'll regret at 3x scale
[ ] Nothing is paid for that isn't in use

If you can tick most of these, you are running better IT ops than most companies at your stage. If you can't, the guides linked above are the deep dives on the sections that still have gaps.

None of this is glamorous. None of it is the kind of thing that shows up in a pitch deck. All of it pays for itself the first time something breaks silently in the middle of the night — which, if you've done this right, you'll find out about in the morning, from an alert that went to the right person, and that someone acted on before anyone outside the company noticed.

That's what good IT ops looks like at a small business. Boring, on purpose.

Start with the boring basics

Site Watcher monitors domains, SSL, uptime, DNS, and vendor status. Email Deliverability Checker covers SPF, DKIM, DMARC, BIMI, and blacklists. Both live today.
