techone --guide=cloud-management

Cloud and server management: prevention, not firefighting

Long-term preventive maintenance for Azure, Hetzner, and your own infrastructure. Monthly checks, backups, security patches. No black box. You see what is happening, what was fixed, and what is coming.

TL;DR

What we do
Monthly preventive maintenance of cloud and servers: security review, updates, backups with restore tests, monitoring, configuration check.
Why it works
Most outages are visible weeks in advance. Regular checks catch them in a planned window, not at 2 AM.
Who it fits
Companies with a production system and no internal DevOps team, who want predictable costs instead of firefighting.
Environments
Azure, Hetzner, hybrid and traditional on-premises. Same process, tools depend on the environment.
How to start
Free initial audit (1-2 weeks), written report with prioritized recommendations. Only then do you decide about cooperation.

Why prevention works better than reaction

Business IT can be managed in two ways. Both work, but one costs significantly more money, nerves, and clients.

Picture a Friday, 4:30 PM. The online store your company runs starts to slow down. At 4:45 the call center reports orders getting stuck. At 5:00 the database crashes. At 5:15 your IT says "this looks serious." And the biggest campaign of the quarter runs that weekend.

Eight hours in a war room follow. Restarts, manual scaling, adding indexes on the fly, calling key people. The outage lasts four hours. Lost revenue: hundreds of thousands. Reputation: days of damage control. Team: burned out.

The worst part? The retrospective shows the problem was visible months in advance. Growing load, missing indexes, nobody was watching. Nobody asked.

We catch problems while they are still small

A missing index, a filling disk, an expiring certificate, failing backups. Problems like these give weeks or months of warning before they become an outage. Regular checks catch them in a planned window, not at 2 AM.

You never pay for firefighting

Emergency fixes at night and on weekends cost several times more than planned work. Preventive maintenance turns "unexpected crises" into predictable monthly expenses. Every CFO prefers that.

We continuously suggest improvements

Regular checks are not just "is everything running". They are a chance to see where you can cut cloud costs, where to improve performance, what to automate. We come with recommendations, not invoices for fixes.

Transparency instead of "everything runs"

A monthly written report showing what was checked, what was fixed, what is planned next. No black box. You know what you are paying for and where your money goes.

The difference between reactive and proactive IT is not in the technology. It is that someone is regularly looking. When a system goes down, an auditor does not ask "why did this happen". They ask "what did you do to prevent it".

Reactive IT vs preventive maintenance

The difference between the two approaches is clearest when you compare concrete parameters. Both work, but they produce different outcomes.

Parameter | Reactive IT | Preventive maintenance
When problems get solved | Only after an outage | Weeks to months in advance, in a planned window
Typical response time | Hours, often nights and weekends | Planned, during business hours
Cost of a single fix | Several times higher (emergency rate, overtime) | Included in monthly retainer
Cost predictability | Low, crises come unexpectedly | High, fixed monthly fee
Audit documentation | Usually missing or fragmented | Monthly written report, auditable
Impact on the team | Stress, burnout, resignations | Calm operations, room to build
Client relationships | Apologies for outages, lost trust | No outages, no apologies

What we actually do every month

Preventive maintenance is not a one-time check. It is a regular process with seven core areas we go through with every client. Specific outputs vary by environment. An Azure application looks different from a dedicated server, but the areas are the same.

1. Security review

Access rights, users, network endpoints. Who has access to what, whether firewalls are in order, whether anyone's credentials have expired. We remove access that is no longer needed. Often the biggest security gap is not what you add, but what you forget to remove.

2. Updates and patches

Versions of individual components: database, runtime, libraries, operating system. Security patches monthly, service packs quarterly. What is pending, what is outdated, what the vendor no longer supports. Small, regular updates are far less dramatic than letting the system age until it breaks.

3. Backups and restore tests

Checking that backups run on schedule and that they can actually be restored. The biggest surprise with backups is that the backup exists but the restore does not work. We test restoration at least once a quarter, more often for critical systems. We also review retention policies, off-site copies, and transaction logs.
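As a rough illustration of the "runs on schedule" part of this check, the sketch below flags a backup that is too old or suddenly much smaller than earlier ones. The `BackupRecord` structure, the 26-hour age limit, and the 0.8 size ratio are assumptions made for the example, not values from any specific tool. A size check like this only narrows down suspects; it does not replace an actual restore test.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupRecord:
    """One completed backup run (hypothetical structure for illustration)."""
    finished_at: datetime
    size_bytes: int


def check_backups(records, now, max_age=timedelta(hours=26), min_size_ratio=0.8):
    """Return a list of warnings; an empty list means no findings.

    max_age and min_size_ratio are assumed thresholds, tune them per system.
    """
    if not records:
        return ["no backups found"]
    ordered = sorted(records, key=lambda r: r.finished_at)
    latest = ordered[-1]
    warnings = []
    # Schedule check: the newest backup must be recent enough.
    if now - latest.finished_at > max_age:
        warnings.append("latest backup is older than the allowed age")
    # Size check: compare the newest backup with a middle value of earlier ones.
    previous = ordered[:-1]
    if previous:
        typical = sorted(r.size_bytes for r in previous)[len(previous) // 2]
        if latest.size_bytes < typical * min_size_ratio:
            warnings.append("latest backup is much smaller than earlier ones")
    return warnings
```

A sudden drop in size is exactly the kind of signal that tends to go unnoticed when nobody compares one backup with the next.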

4. Monitoring and log review

Graphs of performance, storage, latency, and error rates for the past month. We look for anomalies: something that was fine last time and is starting to grow. Exceptions in application logs, unexpected restarts, slow queries. Problems are often visible weeks before they become incidents.
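The idea of "something that was fine last time and is starting to grow" can be sketched as a simple week-over-week comparison. The function name, the three-week window, and the 10% threshold below are assumptions for illustration; real monitoring tools offer more robust anomaly detection.

```python
def sustained_growth(weekly_values, min_weeks=3, min_increase=0.10):
    """Return True if the metric rose by at least min_increase (a fraction)
    in each of the last min_weeks week-over-week steps.

    weekly_values is a list of weekly averages, oldest first.
    """
    if len(weekly_values) < min_weeks + 1:
        return False  # not enough history to judge a trend
    recent = weekly_values[-(min_weeks + 1):]
    for previous, current in zip(recent, recent[1:]):
        if previous <= 0 or (current - previous) / previous < min_increase:
            return False
    return True


# Hypothetical weekly average query times in milliseconds.
query_time_ms = [120, 118, 135, 152, 171]
if sustained_growth(query_time_ms):
    print("query time has grown for three weeks in a row, worth a closer look")
```

Requiring several consecutive increases keeps a single noisy week from raising a false alarm, at the cost of reacting a little later.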

5. Environment configuration check

Whether the cloud or server matches the documentation, whether anyone made a quick config change and forgot to document it, whether costs are creeping up because nobody is watching. This is where we often find the biggest savings: forgotten resources, oversized instances, unused databases.
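A minimal sketch of the "does reality match the documentation" comparison, assuming both sides can be exported as flat key-value settings. The setting names and values are illustrative only, and `find_drift` is a hypothetical helper, not part of any cloud SDK.

```python
MISSING = "<missing>"


def find_drift(documented, actual):
    """Compare two flat dicts of settings.

    Returns {key: (documented_value, actual_value)} for every key that
    differs or exists on only one side.
    """
    drift = {}
    for key in sorted(documented.keys() | actual.keys()):
        expected = documented.get(key, MISSING)
        found = actual.get(key, MISSING)
        if expected != found:
            drift[key] = (expected, found)
    return drift


# Illustrative values: what the documentation says vs. what is deployed.
documented = {"vm_size": "B2s", "db_tier": "S1", "backup_retention_days": 30}
actual = {"vm_size": "B4ms", "db_tier": "S1", "backup_retention_days": 30, "test_db": "S0"}

for key, (expected, found) in find_drift(documented, actual).items():
    print(f"{key}: documented={expected} actual={found}")
```

In this example the undocumented `test_db` and the larger `vm_size` are the kind of entries that usually turn out to be forgotten resources or quick changes nobody wrote down.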

6. Review of last month's changes

What changed, who changed it, why. Documentation and changelog updates. This sounds mundane, but it matters during audits and when handing the system over to a new person. A system nobody documented depends entirely on one person's memory.

7. Planning for next month

What is coming: known campaigns, new releases, seasonal load, upcoming migrations. What we propose for the next check. This is not only "what we will do", but also "what the client should prepare". For example, approvals for larger server purchases, or maintenance windows to plan ahead.

The technical specifics of each area depend on the environment. For Azure applications we track different metrics than for dedicated servers, and for SaaS platforms different ones than for internal ERPs.

How cooperation works

Preventive maintenance starts with understanding what you are managing. There is no one-size-fits-all. A small Azure application needs a different approach than a multi-country e-commerce operation with dedicated infrastructure. If you are still deciding whether and how to move to the cloud in the first place, our cloud migration guide covers that. Preventive maintenance comes into play afterwards.

1. Initial audit

We go through your environment and identify what is in order, what needs attention, and what is missing entirely. The output is a written report with prioritized recommendations and a proposed scope of preventive maintenance. This phase is free and typically takes 1 to 2 weeks. You pay only for actual work that follows.

2. Report scope design

We define what the monthly report will cover. For one client compliance is critical, for another capacity planning, for a third cost optimization. We decide what to track, which metrics matter, how quickly we respond to alerts. Scope can be adjusted as your environment evolves.

3. Monthly reports and actions

Each month we go through the defined areas, send a written report, and make needed adjustments. Larger changes we discuss upfront; smaller ones (like applying patches) we handle directly.

4. Continuous improvement

The scope is not static. As your environment changes, we adjust it. As we learn new things, we add them to the checks. The goal is not to maintain the status quo, but to gradually improve.

This is a long-term relationship, not a one-off engagement. Most clients stay with us for years.

What prevention caught in practice

Theory sounds good, but the real value of prevention only shows in concrete results. Here are a few real situations from recent months where a routine check caught a problem before it had a chance to become an outage. We do not name clients, but the scenarios are authentic.

1. Missing index, two weeks before outage

Database monitoring showed average query time growing week over week. Log analysis revealed a table that had been growing by thousands of rows daily since the last release, and a critical query was scanning the whole table on every call. Adding the index took 10 minutes in a planned window. Without it, we estimated the system would have gone down once the table reached about 500,000 rows, roughly two weeks later at that growth rate.
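To show what "scanning the whole table" looks like in a query plan, here is a small self-contained sketch. It uses an in-memory SQLite database as a stand-in, since the database engine in this case is not stated; the table, column, and index names are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10_000)],
)

sql = "SELECT * FROM orders WHERE customer_id = ?"

# Without an index the planner has to read every row of the table.
plan_before = query_plan(conn, sql, (42,))

# After adding the index the planner can look up matching rows directly.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions and between database engines, but the pattern is the same: a full scan before the index, an index lookup after it.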

2. Backups existed, restore did not work

For a new client, we ran a restore test as part of the initial audit. The backup looked fine, but the restored file was half the expected size. It turned out that six months earlier someone had changed the backup destination path, and only part of the database had been backed up since. Nobody knew, because nobody had verified it. We fixed the path and introduced monthly restore tests.

3. Former employee account, active for two years

During an access review, we found an admin account belonging to someone who had left the company two years earlier. The account still had full production rights, including access to backups and to the database. Nobody had noticed because the departure had been handled by HR only, not by IT. We disabled the account immediately and introduced a process linked to offboarding.
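The core of such an access review can be sketched as a comparison between the accounts that exist and the people who still work at the company. The account fields and names below are made up for the example; in practice the two lists would come from the identity provider and from HR.

```python
def find_stale_accounts(accounts, active_employees):
    """Return accounts whose owner is not on the active employee list.

    accounts: list of dicts with at least an "owner" key (hypothetical shape).
    active_employees: iterable of names, compared case-insensitively.
    """
    active = {name.lower() for name in active_employees}
    return [a for a in accounts if a["owner"].lower() not in active]


# Made-up sample data for illustration.
accounts = [
    {"login": "adm-jnovak", "owner": "J. Novak", "role": "admin"},
    {"login": "adm-kmala", "owner": "K. Mala", "role": "admin"},
]
active_employees = ["k. mala"]

for account in find_stale_accounts(accounts, active_employees):
    print(f"review: {account['login']} ({account['role']}) has no active owner")
```

Running a comparison like this as part of every offboarding, rather than only during periodic reviews, closes the gap described above.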

4. Certificate about to expire over the weekend

A monthly check showed that the SSL certificate for the main domain would expire in 12 days, on a Saturday. Auto-renewal was configured, but a recent web server update had changed the path to the renew script and nobody had noticed. We fixed it during business hours. Without that, visitors on Saturday morning would have hit an invalid certificate error.
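A check like this can be approximated with Python's standard library. The sketch below is an illustration under assumptions, not the tooling used in this case: `fetch_not_after` and `days_until_expiry` are hypothetical helper names, and the sample date was chosen to mirror the 12-day situation described above.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=5.0):
    """Read the notAfter field of the certificate a server presents.

    Uses default verification, so an already expired or invalid
    certificate raises an ssl.SSLError instead of returning a value.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now):
    """Whole days from `now` (timezone-aware) until the notAfter timestamp.

    `not_after` uses the format returned by getpeercert(),
    for example "Jun 15 12:00:00 2024 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


# Offline example with a fixed "today" so the result is reproducible.
today = datetime(2024, 6, 3, 12, 0, tzinfo=timezone.utc)
remaining = days_until_expiry("Jun 15 12:00:00 2024 GMT", today)
if remaining < 30:  # assumed warning threshold
    print(f"certificate expires in {remaining} days, check auto-renewal")
```

Checking the remaining days independently of the renewal job is the point: auto-renewal being configured says nothing about whether it still runs.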

All these situations had one thing in common: they were visible in advance, but nobody was looking. Prevention does not rely on magic, only on someone regularly going through the critical points and noticing what is changing.

Who this is for

Preventive maintenance makes sense for some companies and not for others. And we say so upfront.

Good fit

  • You have a production system your business depends on
  • You have no internal DevOps team, or the one you have is overloaded
  • You want predictable costs instead of surprises
  • You are planning growth or expansion to new markets
  • You need documentation and evidence for audit
  • You are looking for a long-term partner, not a one-off

Not a fit

  • You have an internal DevOps team covering your stack
  • The project is not yet in production and has no users
  • You are shopping for the cheapest offer
  • A yearly audit is enough for you (this service is about regularity)
  • You expect work without documentation or access

Frequently Asked Questions

How much does it cost?

Price depends on three factors: size of your environment (number of servers, databases, applications), scope of the report (what you want us to track), and required response time (business hours vs 24/7 SLA). As a rough guide, a small Azure application with one database costs significantly less than a multi-country e-commerce operation with its own infrastructure. We prepare a specific quote after the initial audit, which is free. You only pay for actual work based on an approved proposal.

What is included in the monthly package?

Monitoring and alerting, incident response within SLA, regular updates and security patches, and a monthly report with findings and recommendations. The scope is tailored to what you actually need.

How quickly do you respond to an outage?

It depends on the SLA we agree on. The standard is within 30 minutes during business hours, and within 15 minutes 24/7 for critical systems. Monitoring runs continuously, so we respond to most problems before you even notice them.

Do you support only cloud, or our own servers too?

Both. We manage cloud (Azure primarily, but others too), dedicated servers (Hetzner and similar), hybrid environments (cloud plus your own servers), and traditional on-premises infrastructure. The process is the same, only the tools differ by environment.

Can you handle smaller infrastructure?

Yes. You do not need dozens of servers. We work with clients who run a single server and a single database. The scope adjusts to reality.

Our company is growing. How should we scale infrastructure?

Two paths: move to the cloud (pay for actual consumption, scale as needed), or containerize applications and orchestrate them (Docker, Kubernetes) on dedicated servers. We combine both approaches depending on the situation. We start with an audit and propose a plan.

Can we start with just an audit, without committing long term?

Yes, the audit is a standalone product. We walk through your environment, analyze it and send you a written report with specific things to fix. Whether it turns into a long-term collaboration is your decision. The audit is not a sales pitch, it is a real analysis.

How is this different from classic IT support?

Classic IT support is reactive: you call when something breaks. This is proactive: we call (or write) when we see a problem coming. Incident response is still part of the service, but the goal is to have as few incidents as possible.

Looking for a partner for cloud management?

Tell us what you run. We will walk through your environment and propose how to manage it.

Free consultation