Core skills

Linux & shell

Instead of trivia-heavy questions, focus on scenarios that actually happen in production. These are the kinds of Linux questions that platform, DevOps, SRE, and backend interviews often revolve around.

Process & system debugging

1. A server suddenly becomes slow. What do you check first?

What interviewers want:

  • Structured debugging
  • Understanding of CPU, memory, disk, and load

Good areas to mention:

  • top / htop
  • vmstat
  • iostat
  • load average vs CPU usage
  • OOM kills
  • disk saturation
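To make "load average vs CPU usage" concrete, one rough approach is to normalise the load average by the number of cores. The sketch below assumes a helper name (load_per_core) that is purely illustrative, not a standard tool.

```shell
# load_per_core LOAD CORES
# Prints the load average divided by the core count, to two decimals.
# Hypothetical helper name; not part of any standard package.
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# On a live Linux host (assumes /proc/loadavg and nproc are available):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
load_per_core 8.00 4
```

As a rule of thumb, a value well above 1.0 per core suggests tasks are queuing. On Linux the load average also counts tasks blocked in uninterruptible sleep, so it is worth confirming with vmstat or iostat whether the pressure is CPU or I/O.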

2. What’s the difference between kill, kill -15, and kill -9?

Expected understanding:

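  • kill with no option sends SIGTERM (15), so kill and kill -15 are equivalent; kill -9 sends SIGKILL
  • Graceful shutdown (SIGTERM can be caught) vs forceful termination (SIGKILL cannot)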
  • Cleanup handlers
  • Zombie/orphan processes

Bonus: Explain why SIGKILL can leave corrupted temp files or broken locks.
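The broken-lock point can be shown with a small experiment. This is a minimal sketch, assuming a hypothetical worker that holds a lock file and removes it in a SIGTERM handler; the function names are made up for illustration.

```shell
# Hypothetical worker: holds a lock file and removes it when asked to stop.
worker() {
  trap "rm -f '$1'; exit 0" TERM   # cleanup handler; SIGKILL never reaches it
  while :; do sleep 0.1; done
}

# signal_demo SIGNAL: start the worker, send SIGNAL, report what is left behind.
signal_demo() {
  lock=$(mktemp)
  worker "$lock" >/dev/null 2>&1 &
  pid=$!
  sleep 0.5                        # give the worker time to install its trap
  kill "-$1" "$pid"
  wait "$pid" 2>/dev/null || true  # a killed worker returns a non-zero status
  if [ -e "$lock" ]; then
    echo "$1: lock left behind"
  else
    echo "$1: lock cleaned up"
  fi
  rm -f "$lock"
}

signal_demo TERM
signal_demo KILL
```

With TERM the handler runs and the lock disappears; with KILL the process is removed by the kernel before any handler can run, so the stale lock remains for the next start to trip over.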

3. A process keeps restarting automatically after you kill it. Why?

Natural follow-ups:

  • systemd
  • Kubernetes restart policy
  • supervisor
  • cron watchdogs

Commands:

systemctl status myapp
ps auxf

4. How would you identify which process is consuming the most memory?

Useful commands:

top
ps aux --sort=-%mem | head
smem

Follow-up: What’s the difference between RSS, VSZ, and cached memory?
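For the follow-up, it can help to read the numbers straight from the kernel. The sketch below assumes a Linux /proc filesystem and a hypothetical helper name (mem_summary). VmSize corresponds to VSZ (address space reserved) and VmRSS to RSS (pages currently resident in RAM); page cache is managed by the kernel and reported system-wide, for example in the buff/cache column of free.

```shell
# mem_summary STATUS_FILE
# Prints virtual size and resident set size (kB) from a /proc/<pid>/status-style file.
# Hypothetical helper name, for illustration only.
mem_summary() {
  awk '/^VmSize:/ { vsz = $2 }
       /^VmRSS:/  { rss = $2 }
       END { printf "VSZ=%s kB RSS=%s kB\n", vsz, rss }' "$1"
}

# Inspect the current shell when /proc is available.
if [ -r "/proc/$$/status" ]; then
  mem_summary "/proc/$$/status"
fi
```

VSZ is usually much larger than RSS because it includes mappings that were reserved but never touched.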

Networking & connectivity

5. Your app cannot reach another service. How do you debug it?

A practical answer usually includes:

  1. DNS resolution
  2. Network connectivity
  3. Listening ports
  4. Firewall/security groups
  5. TLS/authentication

Commands:

dig api.example.com
curl -v https://api.example.com
ss -tlnp
ping -c 3 api.example.com
traceroute api.example.com

6. How do you check which process is listening on port 8080?

ss -tlnp | grep ':8080'

Alternative:

lsof -i :8080

7. What happens during an SSH login?

A strong practical answer:

  • TCP connection
  • Key exchange
  • Host verification
  • Public/private key auth
  • authorized_keys
  • session creation

Bonus: Mention why agent forwarding can be risky.

Filesystem & storage

8. Disk usage suddenly jumped to 100%. How do you investigate?

Expected flow:

df -h
du -xh / | sort -h | tail
find /var -type f -size +1G

Good real-world causes:

  • logs
  • core dumps
  • runaway temp files
  • Docker images
  • deleted-but-open files

Bonus command:

lsof | grep deleted

9. What’s the difference between a hard link and a soft (symbolic) link?

Practical explanation:

  • Hard link → same inode
  • Soft link → pointer/path reference

Important behavior:

  • Soft links break if target moves
  • Hard links survive original filename deletion
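The behavior above can be checked in a scratch directory. This is a small sketch; the function name link_demo and the file names are arbitrary.

```shell
# link_demo: shows inode sharing and what happens when the original name goes away.
link_demo() {
  dir=$(mktemp -d)
  echo "data" > "$dir/original"
  ln "$dir/original" "$dir/hard"        # second name for the same inode
  ln -s "$dir/original" "$dir/soft"     # separate file that stores the target path

  inode_orig=$(ls -i "$dir/original" | awk '{print $1}')
  inode_hard=$(ls -i "$dir/hard" | awk '{print $1}')
  if [ "$inode_orig" = "$inode_hard" ]; then echo "hard link: same inode"; fi

  rm "$dir/original"                    # remove the original filename
  if [ "$(cat "$dir/hard")" = "data" ]; then echo "hard link: still readable"; fi
  if [ ! -e "$dir/soft" ]; then echo "soft link: dangling"; fi
  rm -rf "$dir"
}

link_demo
```

The data stays on disk until the last hard link is removed, while the soft link only ever held a path and now points at nothing.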

10. Why might rm not immediately free disk space?

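Expected concept: a process still holds an open file descriptor to the deleted file, so the kernel frees the blocks only once that descriptor is closed or the process exits.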

Debugging:

lsof | grep deleted
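The effect is easy to reproduce without any special tools. In this sketch the shell itself plays the role of the process holding the file open; the function name is made up for illustration.

```shell
# deleted_but_open_demo: remove a file while a descriptor to it is still open.
deleted_but_open_demo() {
  f=$(mktemp)
  echo "still here" > "$f"
  exec 3<"$f"            # hold the file open on descriptor 3
  rm "$f"                # unlink the name; the inode survives while fd 3 is open
  if [ ! -e "$f" ]; then echo "name removed"; fi
  echo "read via fd: $(cat <&3)"
  exec 3<&-              # closing the last descriptor lets the kernel free the blocks
}

deleted_but_open_demo
```

In production the usual fix is to restart or signal the process holding the file (often a logger) so that it closes the descriptor.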

Permissions & security

11. Explain Linux file permissions using 755.

Expected:

  • owner/group/others
  • read/write/execute

Bonus: Difference between directory execute bit and file execute bit.
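Each octal digit is a sum of read (4), write (2), and execute (1). A minimal sketch that decodes a three-digit mode, using a hypothetical helper name (mode_to_rwx):

```shell
# mode_to_rwx MODE: translate a three-digit octal mode into rwx notation.
# Hypothetical helper; special bits (setuid, setgid, sticky) are not handled.
mode_to_rwx() {
  out=""
  for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
    r=-; w=-; x=-
    if [ $((digit & 4)) -ne 0 ]; then r=r; fi   # 4 = read
    if [ $((digit & 2)) -ne 0 ]; then w=w; fi   # 2 = write
    if [ $((digit & 1)) -ne 0 ]; then x=x; fi   # 1 = execute
    out="$out$r$w$x"
  done
  printf '%s\n' "$out"
}

mode_to_rwx 755   # owner rwx, group r-x, others r-x
```

On a directory the execute bit means permission to traverse it (cd into it and reach entries by name), while read alone only allows listing the names.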

12. When would you use chmod vs chown?

Simple but surprisingly common: chmod changes a file's permission bits (mode), while chown changes its owner and group.

13. A script works with sudo but fails without it. How do you debug?

Possible areas:

  • permissions
  • environment variables
  • PATH differences
  • SELinux/AppArmor
  • file ownership

Logs & observability

14. How do you continuously monitor logs in Linux?

tail -f app.log
journalctl -u myapp -f

Follow-up: Difference between application logs and systemd journal logs.

15. An application crashes randomly. What logs would you inspect?

Good answer:

  • app logs
  • kernel logs
  • OOM messages
  • systemd logs

Commands:

dmesg
journalctl -xe
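When a crash looks random, the kernel log is a quick place to check for the OOM killer. The filter below is a heuristic sketch: the helper name is hypothetical, the sample lines are made up, and the exact kernel wording varies between versions.

```shell
# find_oom_events: filter kernel log text on stdin for OOM-killer activity.
# The patterns are a heuristic, not an exhaustive list.
find_oom_events() {
  grep -iE 'out of memory|oom-kill|oom_reaper' || true
}

# On a live host (may need root):
#   dmesg -T | find_oom_events
#   journalctl -k | find_oom_events
# Made-up sample input for demonstration:
printf '%s\n' \
  'usb 1-1: new high-speed USB device' \
  'Out of memory: Killed process 4242 (myapp)' | find_oom_events
```

If the process was killed by the OOM killer, the application's own logs often just stop, which is why the kernel log matters here.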

Systemd & services

16. How do you start a service automatically on boot?

systemctl enable nginx

Good follow-up: explain what these unit file directives do:

Restart=always
After=network.target
WantedBy=multi-user.target
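To see where those directives live, here is a sketch that writes a minimal unit file to a scratch path. The service name myapp, its ExecStart path, and the helper name write_example_unit are placeholders, not a recommended production configuration.

```shell
# write_example_unit PATH: write a hypothetical minimal unit file to PATH.
write_example_unit() {
  cat > "$1" <<'EOF'
[Unit]
Description=Example application (hypothetical)
# Ordering only: start this unit after network.target has been reached.
After=network.target

[Service]
ExecStart=/usr/local/bin/myapp
# Restart the process whenever it exits, cleanly or not.
Restart=always

[Install]
# "systemctl enable" links the unit into this target so it starts at boot.
WantedBy=multi-user.target
EOF
}

unit=$(mktemp)
write_example_unit "$unit"
cat "$unit"
rm -f "$unit"
```

Note that After= controls ordering only; it does not by itself pull in the other unit or guarantee the network is fully up.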

17. A systemd service is failing repeatedly. How do you debug it?

Useful commands:

systemctl status myapp
journalctl -u myapp

Bonus: Discuss restart loops and exit codes.

Bash & scripting

18. What does set -euo pipefail do in bash?

This is asked a lot in DevOps interviews.

Expected:

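  • -e: fail fast by exiting when a command returns non-zero
  • -u: treat expansion of undefined variables as an error
  • -o pipefail: a pipeline fails if any command in it fails, not only the last one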
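Each option can be demonstrated in isolation. The sketch below assumes bash is installed and runs every case in its own bash process so one failure cannot stop the others; the variable name is a placeholder chosen to be unset.

```shell
# Default: a pipeline reports only the status of its last command.
bash -c 'false | true; echo "default pipeline status: $?"'

# pipefail: the pipeline fails if any stage fails.
bash -c 'set -o pipefail; false | true; echo "pipefail pipeline status: $?"'

# -u: expanding an unset variable is an error (placeholder variable name).
bash -c 'set -u; echo "$HYPOTHETICAL_UNSET_VAR"' 2>/dev/null || echo "nounset: aborted"

# -e: the script exits at the first command that fails.
bash -c 'set -e; false; echo "not reached"' || echo "errexit: stopped"
```

A good answer also mentions the caveats: -e is ignored for commands tested by if, while, &&, or ||, so it is a safety net rather than a substitute for explicit error handling.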

19. What does $? represent?

The exit status of the most recently executed command.

Bonus convention:

  • 0 = success
  • non-zero = failure

20. Write a small bash loop to process files in a directory.

Example:

for file in *.log; do
  [ -e "$file" ] || continue  # skip the literal pattern when nothing matches
  echo "Processing $file"
done

Interviewers usually care more about readability than cleverness.

Scenario-based questions (most valuable)

These feel closest to actual production work.

21. One Kubernetes pod is failing, but others are healthy. How would you investigate?

Expected thought process:

  • logs
  • resource limits
  • node issue
  • config drift
  • dependency connectivity

22. CPU usage is low, but load average is very high. Why?

Strong candidates mention:

  • blocked I/O
  • uninterruptible sleep
  • disk/network bottlenecks
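One way to back this up with evidence is to count tasks in uninterruptible sleep (state D), which contribute to load average without using CPU. A minimal sketch, assuming a hypothetical helper name and ps-style input with the state in the first column:

```shell
# count_blocked: read "STAT COMMAND" lines on stdin and count tasks whose
# state starts with D (uninterruptible sleep, usually waiting on I/O).
count_blocked() {
  awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# On a live host with procps ps:
#   ps -eo stat,comm | count_blocked
# Made-up sample input for demonstration:
printf 'S bash\nD dd\nR awk\nD+ rsync\n' | count_blocked
```

A persistently high count points at storage or network filesystems, at which point iostat and the wa column in vmstat or top are the natural next checks.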

23. A deployment happened 10 minutes ago and now users see 5xx errors. What do you do?

Interviewers want:

  • rollback thinking
  • blast radius assessment
  • metrics/logs
  • mitigation before deep debugging

24. How would you safely restart a production service?

Good practical answer:

  • health checks
  • draining traffic
  • rolling restart
  • verify metrics after restart
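The "verify after restart" step is often a simple retry loop around a health check. This is a sketch under assumptions: wait_healthy is a hypothetical helper, and the check command is whatever proves the service is ready in your environment.

```shell
# wait_healthy ATTEMPTS DELAY CMD...: retry a health check until it passes.
wait_healthy() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempt(s)" >&2
  return 1
}

# Stand-in check for demonstration; in practice this might look like
#   wait_healthy 10 3 curl -fsS http://localhost:8080/healthz   (hypothetical URL)
wait_healthy 3 0 true
```

In a rolling restart, the next instance is restarted only after this check succeeds, and a failure is the signal to stop and roll back rather than continue.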

25. You can SSH into a server, but outbound internet access doesn’t work. What could be wrong?

Potential causes:

  • DNS
  • routing
  • NAT gateway
  • firewall rules
  • proxy configuration

What interviewers are really testing

Most Linux interviews are not about memorizing commands. They are about demonstrating how you think during failures:

  • Can you narrow down scope?
  • Can you gather evidence quickly?
  • Do you understand how systems behave under stress?
  • Can you debug calmly and methodically?

Strong candidates explain why they run a command, not just the command itself.
