Service Management With systemd

Run your app as a real service that starts on boot, restarts on crash, and logs where you can find it. Covers systemctl and journalctl, writing a unit file, restart policies, dependencies, and how to debug a service that will not start.


On almost every modern Linux server - Ubuntu, Debian, RHEL, Amazon Linux - systemd is what starts, stops, and supervises everything from sshd to your own app. If you can write a unit file and read its logs, you can deploy a service that survives reboots and crashes without reaching for nohup, a hand-rolled init script, or a process manager bolted on top.

This is the working subset: the commands you run daily, how to turn your program into a managed service, and how to debug one that will not start.

The model: units and systemctl

systemd manages units. The kind you will touch most is a service (a .service unit), but there are also sockets, timers, mounts, and targets. systemctl is the one command you drive it all with.

systemctl status nginx       # is it running? recent logs, PID, memory
systemctl start nginx        # start it now
systemctl stop nginx         # stop it now
systemctl restart nginx      # stop then start
systemctl reload nginx       # re-read config WITHOUT dropping connections

reload is not the same as restart. restart kills the process and starts a new one (a brief outage); reload tells the running process to re-read its config in place, which well-behaved daemons like nginx do without dropping a single request. Use reload for a config change when the service supports it.
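A reload is only as safe as the config it picks up. As a minimal sketch, assuming nginx is the service and using a hypothetical helper name (safe_reload_nginx), you can gate the reload on nginx's own config test:

```shell
# Reload nginx only if its config parses; otherwise leave the running
# process untouched and report failure.
safe_reload_nginx() {
    if nginx -t; then
        systemctl reload nginx
    else
        echo "nginx config test failed; not reloading" >&2
        return 1
    fi
}
```

For services where you are not sure reload is supported, systemctl reload-or-restart app reloads if the unit supports it and restarts otherwise.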

The single most useful command is status - it shows whether the unit is active, its main PID, memory use, and the last several log lines, which is usually enough to see what is wrong.

Start now vs start on boot: enable vs disable

This trips up everyone once: starting a service and enabling it are two different things.

  • systemctl start app - runs it right now, this boot only.
  • systemctl enable app - makes it start automatically on every boot. It does not start it now.

You almost always want both, so use the shortcut:

systemctl enable --now app   # enable on boot AND start it immediately
systemctl disable --now app  # stop it now AND stop it starting on boot

A service you started but forgot to enable works perfectly - until the box reboots and it silently does not come back. After setting up any service, confirm with systemctl is-enabled app.
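If you set up services from a script, the two checks can be combined. This is a sketch, not a standard tool: check_service is a hypothetical helper built on systemctl is-active and systemctl is-enabled.

```shell
# Report whether a unit is running now AND set to start on boot.
# Succeeds only when both are true.
check_service() {
    unit="$1"
    # Both subcommands exit non-zero for states like "inactive" or
    # "disabled", so ignore the exit status and compare the printed state.
    active="$(systemctl is-active "$unit" || true)"
    enabled="$(systemctl is-enabled "$unit" || true)"
    echo "$unit: active=$active enabled=$enabled"
    [ "$active" = "active" ] && [ "$enabled" = "enabled" ]
}
```

Run it as check_service app at the end of a deploy; a non-zero exit means the service is either not running or will not come back after a reboot.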

Reading the logs: journalctl

systemd captures the stdout and stderr of every service into the journal. No more hunting through /var/log guessing at filenames - the logs are queryable in one place.

journalctl -u app                 # all logs for the "app" unit
journalctl -u app -e              # jump to the end (most recent)
journalctl -u app -f              # follow live, like tail -f
journalctl -u app --since "10 min ago"
journalctl -u app -p err          # only error priority and worse
journalctl -u app -b              # only this boot

The combination you will use constantly when a service is broken: journalctl -u app -e to see how it died, then journalctl -u app -f while you try to start it again so you watch the failure happen live.
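The filters above combine freely. One possible combination, wrapped in a hypothetical helper (recent_problems) so it is easy to reuse:

```shell
# Show warnings and worse for one unit from the current boot,
# limited to the last hour, without opening a pager.
recent_problems() {
    journalctl -u "$1" -b -p warning --since "1 hour ago" --no-pager
}
```

Call it as recent_problems app. The --no-pager flag makes the output safe to pipe into grep or a file.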

If the journal is filling the disk, cap its size in /etc/systemd/journald.conf (the SystemMaxUse= setting) and trim what is already there: journalctl --vacuum-time=7d deletes archived entries older than a week.

Writing your own service

Here is the whole point: turning your program into a managed service. Create a unit file at /etc/systemd/system/myapp.service:

[Unit]
Description=My App API
After=network.target

[Service]
Type=simple
User=appuser
WorkingDirectory=/opt/myapp
ExecStart=/opt/myapp/venv/bin/python server.py
Restart=on-failure
RestartSec=5
Environment=PORT=8080
EnvironmentFile=/opt/myapp/.env

[Install]
WantedBy=multi-user.target

Then load and enable it:

systemctl daemon-reload          # tell systemd to re-read unit files
systemctl enable --now myapp     # start it and set it to start on boot
systemctl status myapp           # confirm it is running

That is a production-grade service: it runs as a non-root user, restarts if it crashes, picks up environment from a file, and comes back after a reboot. Walking through the three sections:

[Unit] - identity and ordering

  • Description - what shows up in status and logs.
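  • After - ordering. After=network.target means "start after the network stack has been set up." It is ordering only: it does not require the other unit, and it does not guarantee that the network is actually reachable. If the app needs a working connection at startup, use After=network-online.target together with Wants=network-online.target.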

[Service] - how to run it

  • Type - simple (the default) means your ExecStart process is the service itself. Use notify if your app signals systemd when it is truly ready, or forking for old-style daemons that background themselves. When in doubt, simple.
  • User - run as a dedicated, unprivileged user, not root. This is the single most important security line in the file.
  • WorkingDirectory - the directory the process starts in.
  • ExecStart - the exact command. Use absolute paths; systemd does not run a login shell, so a bare python or node may not be found, or may resolve to a different binary than the one in your shell. Point at the full binary.
  • Restart - the supervision policy. on-failure restarts on a non-zero exit, a fatal signal, or a timeout; always restarts even on a clean exit. RestartSec adds a delay so a crash-looping service does not hammer the CPU.
  • Environment / EnvironmentFile - inject config. Keep secrets in an EnvironmentFile with tight permissions (chmod 600), not inline in the unit.
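For the EnvironmentFile, it is easiest to create the file with tight permissions from the start rather than fixing them afterwards. A minimal sketch, with make_env_file as a hypothetical helper name:

```shell
# Create an empty env file that only its owner can read or write.
# The path is passed in; /opt/myapp/.env is the one used in this guide.
make_env_file() {
    env_file="$1"
    # install creates the file with the requested mode in one step,
    # so it never exists with looser permissions.
    install -m 600 /dev/null "$env_file"
}
```

After make_env_file /opt/myapp/.env, add one KEY=value pair per line. Note that an existing file at that path is overwritten with an empty one.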

[Install] - what enable hooks into

  • WantedBy=multi-user.target - the standard "normal multi-user system" state. This is what makes enable start the service on boot. Almost every service you write uses this exact line.

The rule you will forget once: daemon-reload

systemd does not watch unit files for changes. Every time you edit a .service file, you must run:

systemctl daemon-reload

before your change takes effect. Skip it and you will edit the file, restart the service, and watch it run the old config - one of the most common "why is my change not working" moments with systemd. Edit the unit, daemon-reload, then restart.
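systemd does know when a loaded unit is out of date, even though it will not act on it. As a sketch, assuming a systemd recent enough to support systemctl show --value, a hypothetical helper (apply_unit_change) can read the NeedDaemonReload property and do the reload for you:

```shell
# Restart a unit, running daemon-reload first if systemd reports that
# the unit file on disk differs from the copy it has loaded.
apply_unit_change() {
    unit="$1"
    stale="$(systemctl show -p NeedDaemonReload --value "$unit")"
    if [ "$stale" = "yes" ]; then
        systemctl daemon-reload
    fi
    systemctl restart "$unit"
}
```

Running daemon-reload unconditionally is also harmless; the check mainly shows where systemd exposes the stale state.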

Restart policies in practice

Restart=on-failure is the right default for most services, but understand the knobs around it. The start-limit settings belong in [Unit], not [Service], and unit files only allow comments on their own lines:

[Unit]
# allow at most 5 starts per 60s window, then give up
StartLimitIntervalSec=60
StartLimitBurst=5
[Service]
Restart=on-failure
# wait 5s between restart attempts
RestartSec=5

The start limit matters. systemd's built-in default is 5 starts within 10 seconds, which a service with RestartSec=5 never reaches, so an instantly failing service would be restarted every 5 seconds forever. With the window widened to 60 seconds, systemd tries five times and then stops, leaving the unit in a failed state so you actually notice. If a service refuses to start because it hit the limit, systemctl reset-failed app clears the counter.
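Whether the limit ever trips is simple arithmetic on these three numbers. The sketch below is a rough model, not systemd's actual accounting: it assumes the process fails instantly, so starts happen exactly RestartSec apart, and start_limit_outcome is a hypothetical helper name.

```shell
# Rough model: starts happen restart_sec apart (process run time ignored).
# The limit trips only if `burst` starts fit inside the interval window.
# Prints when systemd would give up, or that it would keep retrying.
start_limit_outcome() {
    interval="$1"; burst="$2"; restart_sec="$3"
    elapsed=$((burst * restart_sec))
    if [ "$elapsed" -lt "$interval" ]; then
        echo "gives up about ${elapsed}s after the first start"
    else
        echo "keeps retrying every ${restart_sec}s"
    fi
}
```

With the values from this section, start_limit_outcome 60 5 5 prints "gives up about 25s after the first start". With a 10-second window, start_limit_outcome 10 5 5 prints "keeps retrying every 5s", because five starts spaced 5 seconds apart never fit inside it.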

Dependencies and ordering

Two different things people conflate:

  • Ordering (After=, Before=) - when to start relative to another unit. Does not pull the other unit in.
  • Requirements (Wants=, Requires=) - whether to start another unit. Wants= is a soft dependency (start it too, but do not fail if it fails); Requires= is hard (combined with After=, this unit is not started if the dependency fails to start, and it is stopped if the dependency is explicitly stopped).

A common pattern: an app that needs a database on the same box.

[Unit]
After=postgresql.service
Wants=postgresql.service

Prefer Wants= over Requires= unless your service genuinely cannot function without the other - hard requirements create failure cascades that are hard to debug.
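To see what systemd actually derived from those lines, you can list both kinds of relationship for a unit. A small sketch, with show_unit_links as a hypothetical helper name:

```shell
# Print the units a service pulls in, then the units it is ordered after.
show_unit_links() {
    unit="$1"
    echo "== pulled in by $unit (Wants=/Requires=) =="
    systemctl list-dependencies --no-pager "$unit"
    echo "== started before $unit (After=) =="
    systemctl list-dependencies --no-pager --after "$unit"
}
```

If postgresql.service shows up in the second list but not the first, you have ordering without a requirement, which is exactly the conflation described above.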

Scheduled jobs: systemd timers

systemd timers are the modern alternative to cron, and they integrate with everything above - the job runs as a service, logs to the journal, and you debug it the same way. A timer is two units: a .service that does the work and a .timer that triggers it.

# backup.timer
[Unit]
Description=Nightly backup

[Timer]
# every day at 02:30
OnCalendar=*-*-* 02:30:00
# run on next boot if the machine was off
Persistent=true

[Install]
WantedBy=timers.target

Then enable the timer, not the service:

systemctl enable --now backup.timer
systemctl list-timers              # see all timers and their next run

The wins over cron: journalctl -u backup gives you the job's output (cron emails it into the void), and Persistent=true catches up a missed run after downtime. For anything non-trivial, prefer a timer.
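The timer needs a matching backup.service to trigger; by default a timer activates the service with the same name. A minimal sketch of that second unit, written by a hypothetical helper (write_backup_service) and assuming the real work lives in a script at /usr/local/bin/backup.sh, which is a placeholder path:

```shell
# Write a oneshot service for backup.timer to trigger.
# /usr/local/bin/backup.sh is a hypothetical script path.
write_backup_service() {
    unit_dir="$1"   # /etc/systemd/system on a real host
    cat > "$unit_dir/backup.service" <<'EOF'
[Unit]
Description=Nightly backup job

[Service]
# oneshot: the unit counts as started once the command exits
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
EOF
}
```

There is no [Install] section because the timer, not the boot process, starts this service. You can run the job once by hand with systemctl start backup.service, and check the schedule expression with systemd-analyze calendar '*-*-* 02:30:00'.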

Debugging a service that will not start

The reliable loop when systemctl start app fails:

  1. systemctl status app - read the bottom lines. It usually shows the exit code and the last few log lines.
  2. journalctl -u app -e - the full recent logs. The actual error is almost always here: a missing file, a bad path, a port in use, a permission error.
  3. Check the usual suspects: is the ExecStart path absolute and correct? Does the User have permission to the WorkingDirectory and files (see the file permissions guide)? Did you daemon-reload after editing?
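  4. Run the ExecStart command by hand as that user, from the WorkingDirectory, to see the raw error: cd /opt/myapp && sudo -u appuser /opt/myapp/venv/bin/python server.py. If it fails here too, the problem is your app, not systemd. If it works here but not under systemd, suspect an environment variable the unit does not provide.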
  5. systemctl reset-failed app if it hit the start limit, fix the cause, then start again.

Most "the service will not start" cases are one of: a non-absolute path in ExecStart, the wrong User lacking permissions, a forgotten daemon-reload, or an environment variable the app needs that the unit did not provide. Check those four first.
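Two of those checks can be done on the unit file itself before you ever start the service. The following is a sketch with a hypothetical helper name (preflight_unit). It only looks at ExecStart and User, and it enforces this guide's absolute-path rule even on systemd versions that would accept a bare command name.

```shell
# Check a unit file for two common mistakes.
# Prints one line per problem; succeeds only if none are found.
preflight_unit() {
    unit_file="$1"
    problems=0

    # ExecStart must begin with "/" (optionally after systemd's
    # prefix characters such as "-" or "@").
    if grep -E '^ExecStart=' "$unit_file" | grep -Evq '^ExecStart=[-@:+!]*/'; then
        echo "ExecStart does not use an absolute path"
        problems=$((problems + 1))
    fi

    # The account named in User= must exist on this machine.
    user="$(sed -n 's/^User=//p' "$unit_file" | head -n 1)"
    if [ -n "$user" ] && ! id -u "$user" >/dev/null 2>&1; then
        echo "User '$user' does not exist"
        problems=$((problems + 1))
    fi

    [ "$problems" -eq 0 ]
}
```

Run it as preflight_unit /etc/systemd/system/myapp.service. For a fuller syntax check, systemd-analyze verify on the same file reports unknown directives and sections.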

One more: masking

Beyond disable, you can mask a unit to make it completely unstartable - even as a dependency of something else. Useful when you want to be certain a service can never come up:

systemctl mask apache2       # symlink it to /dev/null; cannot be started
systemctl unmask apache2     # reverse it

Reach for mask when disable is not enough - for example, stopping a distro's bundled service from being pulled in by another package. For everyday "I do not want this on boot," plain disable is the right tool.
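To confirm a mask took effect, systemctl is-enabled reports the state as masked. A small sketch, with is_masked as a hypothetical helper name:

```shell
# Succeeds if the unit is currently masked.
is_masked() {
    # is-enabled exits non-zero for a masked unit, so ignore the exit
    # status and compare the printed state instead.
    state="$(systemctl is-enabled "$1" 2>/dev/null || true)"
    [ "$state" = "masked" ]
}
```

For example, is_masked apache2 && echo "apache2 cannot be started" is a cheap guard in a provisioning script.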

The shape of it

Day to day, systemd is a small number of verbs: status to check, start/stop/restart/reload to control, enable/disable for boot, and journalctl -u for logs. Writing a service is one unit file with three sections and a daemon-reload. Get comfortable deploying one real app as a unit - non-root user, Restart=on-failure, logs in the journal - and you have the pattern for every service you will ever run.