Linux & shell

Instead of trivia-heavy questions, focus on scenarios that actually happen in production. These are the kinds of Linux questions platform, DevOps, SRE, and backend interviews often revolve around.

Process & system debugging

1. A server suddenly becomes slow. What do you check first?

What interviewers want:

Structured debugging
Understanding of CPU, memory, disk, and load

Good areas to mention:

top / htop
vmstat
iostat
load average vs CPU usage
OOM kills
disk saturation
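
To make "load average vs CPU usage" concrete, one option is to normalize the load by the core count. This is a minimal sketch assuming a Linux host; the helper name `load_per_core` is made up for illustration.

```bash
# Hypothetical helper: divide a load average by the number of cores.
# A result well above 1.00 means more runnable or blocked tasks than cores.
load_per_core() {
  # $1 = load average, $2 = core count
  awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Live usage on a Linux host (reads the 1-minute load average):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

If the ratio is high, `vmstat 1 5` helps separate CPU pressure (the `r` column) from I/O wait (the `b` and `wa` columns), and `iostat -x 1 5` shows per-device utilization.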

2. What’s the difference between `kill`, `kill -15`, and `kill -9`?

Expected understanding:

Graceful shutdown vs forceful termination
Cleanup handlers
Zombie/orphan processes

Bonus: Explain why SIGKILL can leave corrupted temp files or broken locks.
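
The cleanup point can be shown with a small sketch. The `worker` function below is hypothetical: it removes its scratch file when it receives SIGTERM, which is what plain `kill` and `kill -15` send.

```bash
# Hypothetical worker: creates a scratch file and removes it on SIGTERM.
worker() {
  scratch="$1"
  # The handler runs for SIGTERM (plain `kill` or `kill -15`).
  # SIGKILL (`kill -9`) cannot be trapped, so this cleanup never runs for it.
  trap 'rm -f "$scratch"; exit 0' TERM
  : > "$scratch"
  while :; do sleep 0.1; done
}

# Try it:
#   worker /tmp/demo.scratch &   then   kill $!      (file is removed)
#   worker /tmp/demo.scratch &   then   kill -9 $!   (file is left behind)
```

The leftover file in the `kill -9` case is the same mechanism that leaves stale lock files and half-written temp files in production.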

3. A process keeps restarting automatically after you kill it. Why?

Natural follow-ups:

systemd
Kubernetes restart policy
supervisor
cron watchdogs

Commands:

systemctl status myapp
ps auxf
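
When `ps auxf` output is too noisy, walking the parent chain of the restarted process usually reveals the supervisor. A minimal sketch, assuming Linux `/proc`; the helper name `ancestors` is hypothetical.

```bash
# Hypothetical helper: print the ancestor chain of a PID (Linux /proc only).
# Whatever sits directly above the restarting process is usually the supervisor.
ancestors() {
  pid="$1"
  while [ "${pid:-0}" -gt 0 ] && [ -r "/proc/$pid/status" ]; do
    printf '%s %s\n' "$pid" "$(cat "/proc/$pid/comm")"
    pid="$(awk '/^PPid:/ { print $2 }' "/proc/$pid/status")"
  done
}

# Usage: ancestors 12345    (12345 is a placeholder PID)
```

If the chain goes straight to PID 1 on a systemd host, `systemctl status <pid>` reports which unit owns the process.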

4. How would you identify which process is consuming the most memory?

Useful commands:

top
ps aux --sort=-%mem | head
smem

Follow-up: What’s the difference between RSS, VSZ, and cached memory?
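
For the follow-up, it can help to read both numbers for one process side by side. This sketch assumes Linux `/proc/<pid>/status`; `mem_of` is an illustrative name.

```bash
# Hypothetical helper: print virtual size and resident set size (kB) for a PID.
# VmSize corresponds to VSZ (all mapped address space);
# VmRSS corresponds to RSS (pages actually resident in RAM).
mem_of() {
  awk '/^VmSize:/ { vsz = $2 } /^VmRSS:/ { rss = $2 }
       END { printf "vsz_kb=%s rss_kb=%s\n", vsz, rss }' "/proc/$1/status"
}

# Usage: mem_of $$    (the current shell)
```

Cached memory is a separate thing: the page cache shows up under `buff/cache` in `free -h` and is reclaimable by the kernel, so a large cache alone is not a sign that processes are running out of memory.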

Networking & connectivity

5. Your app cannot reach another service. How do you debug it?

A practical answer usually includes:

DNS resolution
Network connectivity
Listening ports
Firewall/security groups
TLS/authentication

Commands:

dig api.example.com
curl -v https://api.example.com
ss -tlnp
ping
traceroute
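
The same layers can be checked in order by a short script that stops at the first failure. This is a sketch, not a complete tool: it assumes bash, `timeout`, and `getent` are available, and both function names are hypothetical.

```bash
# Hypothetical helper: succeed if a TCP connection to host:port opens within 3s.
# Uses bash's /dev/tcp redirection, so no extra tools beyond bash and timeout.
tcp_open() {
  timeout 3 bash -c ': < "/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Walk the layers in order and stop at the first one that fails.
# $1 = hostname, $2 = port
check_service() {
  getent hosts "$1" > /dev/null || { echo "dns: cannot resolve $1"; return 1; }
  tcp_open "$1" "$2"            || { echo "tcp: cannot connect to $1:$2"; return 2; }
  echo "ok: $1:$2 accepts TCP connections"
}

# Usage: check_service api.example.com 443
```

Once TCP connects, `curl -v` covers the remaining layers (TLS handshake, certificates, authentication).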

6. How do you check which process is listening on port 8080?

ss -tlnp | grep ':8080'

Alternative:

lsof -i :8080

7. What happens during an SSH login?

A strong practical answer:

TCP connection
Key exchange
Host verification
Public/private key auth
authorized_keys
session creation

Bonus: Mention why agent forwarding can be risky.
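
A common real-world failure in the `authorized_keys` step is file permissions: with sshd's `StrictModes` option enabled (the default), key auth is refused when the user's SSH files are writable by others. A minimal check, assuming GNU `find`; the function name is hypothetical.

```bash
# Hypothetical helper: warn about group- or world-writable SSH paths.
# Assumes GNU find (for -perm /022).
check_ssh_perms() {
  dir="${1:-$HOME/.ssh}"
  loose="$(find "$dir" "$dir/authorized_keys" -maxdepth 0 -perm /022 2>/dev/null)"
  if [ -n "$loose" ]; then
    echo "too permissive: $loose"
    return 1
  fi
  echo "ok"
}

# Usage (on the server, as the login user): check_ssh_perms
```

On the client side, `ssh -v user@host` prints each stage as it happens, which maps directly onto the list above.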

Filesystem & storage

8. Disk usage suddenly jumped to 100%. How do you investigate?

Expected flow:

df -h
du -xh / | sort -h | tail
find /var -type f -size +1G

Good real-world causes:

logs
core dumps
runaway temp files
Docker images
deleted-but-open files

Bonus command:

lsof | grep deleted

9. What’s the difference between a hard link and a soft link?

Practical explanation:

Hard link → same inode
Soft link → pointer/path reference

Important behavior:

Soft links break if target moves
Hard links survive original filename deletion
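
Both behaviors are easy to demonstrate in a throwaway directory. The sketch below assumes GNU `stat`; `link_demo` is an illustrative name.

```bash
# Demonstrates hard vs soft link behavior in a throwaway directory.
# Assumes GNU stat (for -c %i).
link_demo() {
  dir="$(mktemp -d)"
  echo "payload" > "$dir/original"
  ln    "$dir/original" "$dir/hard"   # second name for the same inode
  ln -s "$dir/original" "$dir/soft"   # small file that stores the target path

  if [ "$(stat -c %i "$dir/original")" = "$(stat -c %i "$dir/hard")" ]; then
    echo "hard link shares the inode"
  fi

  rm "$dir/original"
  cat "$dir/hard"                                            # data is still reachable
  cat "$dir/soft" 2>/dev/null || echo "soft link is dangling"
  rm -rf "$dir"
}
```

Running `link_demo` prints the inode message, then `payload` (read through the hard link after the original name is gone), then the dangling-link message.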

10. Why might `rm` not immediately free disk space?

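Expected concept: a process still holds an open file descriptor to the file. `rm` only removes the name; the kernel frees the blocks once the last descriptor is closed.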

Debugging:

lsof | grep deleted
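
When `lsof` is not installed, the same information is available under `/proc`. A minimal sketch for a single process, assuming Linux; `deleted_fds` is a hypothetical name.

```bash
# Hypothetical helper: list file descriptors of a PID that point at deleted files.
# Linux marks such links in /proc/<pid>/fd with a " (deleted)" suffix.
deleted_fds() {
  for fd in "/proc/$1/fd"/*; do
    target="$(readlink "$fd" 2>/dev/null)" || continue
    case "$target" in
      *" (deleted)") echo "${fd##*/} -> $target" ;;
    esac
  done
}

# Usage: deleted_fds 12345    (12345 is a placeholder PID)
```

The space comes back when the process closes the descriptor or is restarted.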

Permissions & security

11. Explain Linux file permissions using `755`.

Expected:

owner/group/others
read/write/execute

Bonus: Difference between directory execute bit and file execute bit.
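
Each octal digit is a sum of read (4), write (2), and execute (1), so `755` is `rwx` for the owner and `r-x` for group and others. The sketch below shows the mapping on a scratch file; it assumes GNU `stat`, and `show_mode` is an illustrative name.

```bash
# Hypothetical helper: show the symbolic form of an octal mode on a scratch file.
# Assumes GNU stat (for -c %A).
show_mode() {
  f="$(mktemp)"
  chmod "$1" "$f"
  stat -c '%A' "$f"
  rm -f "$f"
}

show_mode 755   # prints -rwxr-xr-x
show_mode 640   # prints -rw-r-----
```

For the bonus: on a file, execute means the file can be run; on a directory, it means entries inside can be accessed (traversal), while read only allows listing the names.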

12. When would you use `chmod` vs `chown`?

Simple but surprisingly common: `chmod` changes the permission bits, `chown` changes the owning user and group.

13. A script works with `sudo` but fails without it. How do you debug?

Possible areas:

permissions
environment variables
PATH differences
SELinux/AppArmor
file ownership
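
PATH differences are quick to confirm by diffing the two values. The helper below is a sketch with a made-up name; it prints the entries that appear in the first PATH string but not in the second.

```bash
# Hypothetical helper: print PATH entries present in $1 but missing from $2.
path_only_in() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
    case ":$2:" in
      *":$dir:"*) ;;                      # present in both, skip
      *) printf '%s\n' "$dir" ;;
    esac
  done
}

# Directories that sudo sees but the current shell does not:
#   path_only_in "$(sudo sh -c 'echo "$PATH"')" "$PATH"
```

If the PATH values match, the next suspects are ownership and mode of the files the script touches (`ls -l`, `namei -l <path>`).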

Logs & observability

14. How do you continuously monitor logs in Linux?

tail -f app.log
journalctl -u myapp -f

Follow-up: Difference between application logs and systemd journal logs.

15. An application crashes randomly. What logs would you inspect?

Good answer:

app logs
kernel logs
OOM messages
systemd logs

Commands:

dmesg
journalctl -xe
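
Kernel logs are noisy, so a small filter for OOM-killer activity saves time. This is a sketch: the patterns cover message wording seen on common kernels and may need adjusting, and `oom_lines` is a hypothetical name.

```bash
# Hypothetical filter: keep only kernel log lines that look like OOM-killer activity.
# Reads from stdin; exits non-zero when nothing matches (standard grep behavior).
oom_lines() {
  grep -iE 'out of memory|oom-kill|oom_reaper'
}

# Usage:
#   dmesg -T | oom_lines
#   journalctl -k | oom_lines
```

A match that names the application's process explains a "random" crash that leaves nothing in the app's own logs.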

Systemd & services

16. How do you start a service automatically on boot?

systemctl enable nginx

Good follow-up: explain these directives.

Restart=always
After=network.target
WantedBy=multi-user.target
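
To see where those three directives live, here is a sketch that writes a small example unit to a path of your choice for review. The binary path, description, and function name are placeholders, not taken from a real service.

```bash
# Hypothetical helper: write an example unit file to the given path for review.
write_example_unit() {
  cat > "$1" <<'EOF'
[Unit]
Description=Example app (placeholder)
# Ordering only: start this unit after network.target has been reached
After=network.target

[Service]
ExecStart=/usr/local/bin/myapp
# Restart the process whenever it exits, whatever the exit status
Restart=always
RestartSec=5

[Install]
# "systemctl enable" hooks the unit into the normal multi-user boot target
WantedBy=multi-user.target
EOF
}

# Usage: write_example_unit ./myapp.service
```

After copying the file to `/etc/systemd/system/`, `systemctl daemon-reload` followed by `systemctl enable --now myapp` picks it up. Note that `After=` controls ordering only; it does not by itself guarantee the network is usable.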

17. A systemd service is failing repeatedly. How do you debug it?

Useful commands:

systemctl status myapp
journalctl -u myapp

Bonus: Discuss restart loops and exit codes.

Bash & scripting

18. What does `set -euo pipefail` do in bash?

This is asked a lot in DevOps interviews.

Expected:

fail fast
undefined variables
pipeline failure handling
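
Each option is easiest to explain in isolation. The sketch below runs every case in its own `bash -c` so a failure does not affect the calling shell; the function names are illustrative.

```bash
# Without pipefail, a pipeline's status is that of its last command.
demo_pipefail_off() { bash -c 'false | true; echo "status=$?"'; }

# With pipefail, any failing stage makes the whole pipeline fail.
demo_pipefail_on()  { bash -c 'set -o pipefail; false | true; echo "status=$?"'; }

# -e: the script stops at the first failing command, so the echo never runs.
demo_errexit()      { bash -c 'set -e; false; echo "not reached"'; }

# -u: expanding an unset variable is an error instead of an empty string.
demo_nounset()      { bash -c 'set -u; echo "value: $undefined_var"'; }

demo_pipefail_off   # prints status=0
demo_pipefail_on    # prints status=1
```

`demo_errexit` prints nothing and returns non-zero; `demo_nounset` prints an "unbound variable" error to stderr and returns non-zero.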

19. What does `$?` represent?

The exit status of the most recently executed command.

Bonus convention:

0 = success
non-zero = failure
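
A useful extension of the convention: shells report "terminated by signal N" as 128 + N, which is why an OOM-killed or `kill -9`'d process shows up as 137. The helper below is a sketch with a made-up name.

```bash
# Hypothetical helper: describe an exit status using common shell conventions.
explain_exit() {
  if [ "$1" -eq 0 ]; then
    echo "success"
  elif [ "$1" -gt 128 ]; then
    # Shells report "terminated by signal N" as 128 + N.
    echo "killed by signal $(( $1 - 128 )) ($(kill -l "$1"))"
  else
    echo "failed with status $1"
  fi
}

explain_exit 137   # prints: killed by signal 9 (KILL)
```

Remember that `$?` is overwritten by every command, so capture it immediately (`status=$?`) if you need it later.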

20. Write a small bash loop to process files in a directory.

Example:

shopt -s nullglob  # otherwise the loop runs once on the literal "*.log" when nothing matches
for file in *.log; do
  echo "Processing $file"
done

Interviewers usually care more about readability than cleverness.

Scenario-based questions (most valuable)

These feel closest to actual production work.

21. One Kubernetes pod is failing, but others are healthy. How would you investigate?

Expected thought process:

logs
resource limits
node issue
config drift
dependency connectivity

22. CPU usage is low, but load average is very high. Why?

Strong candidates mention:

blocked I/O
uninterruptible sleep
disk/network bottlenecks
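
On Linux, tasks in uninterruptible sleep (state `D`) count toward the load average even though they use no CPU. A small filter makes them easy to spot; this is a sketch, and `blocked_only` is a hypothetical name.

```bash
# Hypothetical filter: keep processes in uninterruptible sleep (state D).
# Expects "STATE PID COMMAND" lines, as printed by: ps -eo state=,pid=,comm=
blocked_only() {
  awk '$1 ~ /^D/ { print $2, $3 }'
}

# Usage: ps -eo state=,pid=,comm= | blocked_only
```

If the same PIDs keep appearing across several runs, check what they are waiting on: a slow disk, a hung NFS mount, or a saturated device in `iostat -x`.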

23. A deployment happened 10 minutes ago and now users see 5xx errors. What do you do?

Interviewers want:

rollback thinking
blast radius assessment
metrics/logs
mitigation before deep debugging
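
For a quick blast-radius estimate when dashboards are slow to load, the access log can be summarized directly. This sketch assumes the status code is the ninth whitespace-separated field, as in the common and combined log formats; the function name and log path are placeholders.

```bash
# Hypothetical helper: count 5xx responses in an access log read from stdin.
# Assumes the status code is whitespace-separated field 9.
count_5xx() {
  awk '{ total++ } $9 ~ /^5[0-9][0-9]$/ { errors++ }
       END { printf "%d of %d requests returned 5xx\n", errors, total }'
}

# Usage: tail -n 1000 /var/log/nginx/access.log | count_5xx
```

If the ratio is high and started with the deployment, roll back first and dig into root cause afterwards.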

24. How would you safely restart a production service?

Good practical answer:

health checks
draining traffic
rolling restart
verify metrics after restart
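
The "health checks" step can be scripted as a retry loop that gates the next instance in a rolling restart. A minimal sketch; the function name, service name, and health URL are assumptions.

```bash
# Hypothetical helper: retry a health-check command until it succeeds.
# $1 = max attempts, $2 = seconds between attempts, rest = command to run
wait_healthy() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" > /dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "still unhealthy after $attempts attempt(s)"
  return 1
}

# Usage (placeholder service and URL):
#   systemctl restart myapp && wait_healthy 10 2 curl -fsS http://localhost:8080/healthz
```

Only move on to the next host when this returns success; a failure is the signal to stop the rollout.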

25. You can SSH into a server, but outbound internet access doesn’t work. What could be wrong?

Potential causes:

DNS
routing
NAT gateway
firewall rules
proxy configuration

What interviewers are really testing

Most Linux interviews are not about memorizing commands. They are about demonstrating how you think during failures:

Can you narrow down scope?

Can you gather evidence quickly?

Do you understand how systems behave under stress?

Can you debug calmly and methodically?

Strong candidates explain why they run a command, not just the command itself.
