Network Troubleshooting in Linux: A Beginner’s Step-by-Step Guide


Network issues can be among the most frustrating challenges for Linux users, system administrators, and developers. Whether you’re managing a server in a data center, a virtual machine, or a home lab router, having a systematic approach to troubleshooting is essential. In this beginner-friendly guide, you’ll learn how to identify and resolve network problems using a structured workflow: gather information, form a hypothesis, test and isolate, implement a fix, and verify the result. We’ll cover practical commands, interpretation tips, and checks organized layer by layer (link, IP, DNS, transport, and firewall) to help you pinpoint where the problem lies.

Important Notes Before You Begin

  • Perform testing on non-production systems or during maintenance windows when possible.
  • Many commands require elevated privileges; use sudo or operate from an admin account.
  • Use persistent sessions (e.g., tmux or screen) to prevent SSH session drops during remote troubleshooting.

Keep these principles in mind: reproduce the issue, change one aspect at a time, and document each step for easier rollback if necessary.


Linux Networking Basics

A quick primer on key networking concepts includes:

  • OSI Layers to Focus On: Link (Ethernet/Wi-Fi), Network (IP/routing), Transport (TCP/UDP), Application (DNS/HTTP/SSH).
  • Key Terms:
    • Interface: A network device (eth0, enp3s0, wlan0, docker0).
    • IP Address & Netmask: Identifies the host and subnet (e.g., 192.168.1.10/24).
    • Gateway: The router that forwards packets addressed to hosts outside your local subnet.
    • DNS: Translates domain names to IP addresses.
    • ARP: Maps IP addresses to MAC addresses on the local network.
  • DHCP vs. Static: DHCP assigns IP addresses automatically, while static requires manual assignment.
  • Common Services That Change Runtime Config: NetworkManager, systemd-resolved, netplan (for Ubuntu), and systemd-networkd. These may overwrite your manual changes.

Note: The ip/iproute2 suite is the modern tool for Linux networking. Prefer this over legacy tools like ifconfig and route. For more details, refer to the man page for ip.
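
To make the IP address and netmask term above concrete, here is a minimal sketch that computes the network address for a CIDR string with shell arithmetic. The function name network_of is hypothetical; it handles IPv4 only and does no input validation.

```bash
#!/bin/sh
# network_of: print the IPv4 network address for an ADDRESS/PREFIX string.
# Hypothetical helper for illustration; IPv4 only, no input validation.
network_of() {
    addr=${1%/*}      # part before the slash, e.g. 192.168.1.10
    prefix=${1#*/}    # part after the slash, e.g. 24

    # Split the dotted quad into four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $addr
    IFS=$old_ifs

    # Pack the four octets into one 32-bit integer.
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))

    # Build the netmask: PREFIX one-bits followed by zero-bits.
    if [ "$prefix" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    fi

    # Keep only the network bits and print them as a dotted quad.
    net=$(( ip & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
}
```

For example, network_of 192.168.1.10/24 prints 192.168.1.0. Any destination that shares this network address is reached directly on the local link; everything else is sent to the gateway.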


Prepare and Gather Information

Before making changes, establish a baseline and verify your access:

  • Confirm permissions: sudo/root and whether access is local or remote.
  • Accurately record the problem: include timestamps, failed commands, screenshots, and complete error messages.
  • Baseline Commands to Save:
    # Interfaces and addresses
    ip addr show
    # Routes
    ip route show
    # DNS resolver status
    resolvectl status || cat /etc/resolv.conf
    # Listening sockets
    ss -tuln

  • If troubleshooting remotely, start a tmux session. On Debian/Ubuntu: sudo apt install tmux && tmux (use your distribution's package manager elsewhere).
  • Save outputs for future comparison: ip addr show > /tmp/ip-addr.before.txt.

Always document logs and screenshots before any changes.
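
The baseline step can be scripted so that every snapshot has the same layout. The sketch below is one possible approach, not a standard tool: the function name save_baseline, the file names, and the directory layout are all illustrative choices.

```bash
#!/bin/sh
# save_baseline: write the output of each baseline command to its own file
# under DIR so it can be compared later. The function name, file names, and
# layout are illustrative choices, not a standard.
save_baseline() {
    dir=$1
    mkdir -p "$dir" || return 1

    # run_to FILE CMD...: record the command line, then its output.
    # A missing tool or a failing command is noted instead of aborting.
    run_to() {
        out=$dir/$1
        shift
        {
            echo "# $*"
            "$@" 2>&1 || echo "# (command failed or is not installed)"
        } > "$out"
    }

    run_to ip-addr.txt   ip addr show
    run_to ip-route.txt  ip route show
    run_to resolv.txt    cat /etc/resolv.conf
    run_to sockets.txt   ss -tuln
    date > "$dir/timestamp.txt"
}
```

A typical use would be save_baseline /tmp/net-before ahead of a change and save_baseline /tmp/net-after once it is made, followed by diff -ru /tmp/net-before /tmp/net-after. The timestamp file will always differ; the other files show what the change actually affected.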


Essential Commands

Here are the primary tools for troubleshooting, and when to use them:

  • ip (ip addr, ip link, ip route, ip neigh): Inspect interfaces, routes, and ARP. Look for interface status (UP/DOWN), addresses, and default via routes.
    # Show interfaces
    ip addr show
    # Show routing table
    ip route
    # Look up ARP neighbors
    ip neigh
    
  • ss (ss -tuln, ss -s) and netstat (legacy): Use ss for modern systems to show sockets and listening ports:
    # Show TCP/UDP listening ports
    ss -tuln
    # Summary
    ss -s
    
  • ping: Test basic reachability and latency. Distinguish DNS problems by pinging both IPs and hostnames:
    ping -c 4 8.8.8.8
    ping -c 4 example.com
    
  • traceroute / tracepath / mtr: Map the path packets take and analyze per-hop latency/loss. Run mtr interactively for a continuously updated view, or with --report for a one-shot summary:
    traceroute 8.8.8.8
    mtr --report example.com
    
  • dig / nslookup: Query DNS servers directly:
    # Query Google's DNS
    dig @8.8.8.8 example.com +short

  • arp / ip neigh: Inspect ARP cache to verify IP-MAC mappings and identify duplicates.
  • nmcli / resolvectl / systemctl: Check NetworkManager, resolver, and service statuses:
    nmcli device status
    resolvectl status
    systemctl status NetworkManager

  • ethtool: Diagnose NIC link status, speed, duplex, and offload settings:
    sudo ethtool eth0
    
  • iptables / nft / ufw / firewall-cmd: Inspect firewall rules. Modern systems utilize nftables; many distros still support iptables:
    sudo nft list ruleset # nftables
    sudo iptables -L -n -v  # legacy iptables
    
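Purpose                | Modern Tool  | Legacy Tool | Notes
-----------------------|--------------|-------------|------
Interfaces & Addresses | ip (ip addr) | ifconfig    | Use ip; ifconfig may be missing on minimal systems
Routes                 | ip route     | route       | ip can also show other routing tables (ip route show table all) and policy rules (ip rule)
Sockets                | ss           | netstat     | ss is faster and more feature-rich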
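
Minimal installations often lack some of these tools, so it can help to check what is available before starting. A small sketch follows; the helper name missing_tools is hypothetical, and the package that provides each tool varies by distribution.

```bash
#!/bin/sh
# missing_tools: print the name of every listed command that is not on PATH.
# Returns 0 when everything is present, 1 otherwise.
# Hypothetical helper; package names for each tool vary by distribution.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}
```

For the tools in this section, the call would be missing_tools ip ss ping traceroute mtr dig ethtool nft tcpdump. Anything it prints needs to be installed or replaced with an alternative.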

Step-by-Step Diagnostics by Layer

This section guides you through targeted checks to isolate issues layer by layer, from the physical link up to services and firewalls:

  1. Link Layer (Physical/NIC)

    • Symptoms: Interface DOWN, NO-CARRIER, or frequent link flaps.
    • Commands:
    ip link show
    sudo ethtool enp3s0
    sudo dmesg | grep -iE 'eth|enp|link'
    
    • Checklist: Look for UP/DOWN flags, NO-CARRIER, driver errors in dmesg, and RX/TX errors.
    • For Wi-Fi, verify SSID, signal strength, and authentication logs.
  2. IP Layer (Addressing, Routes, ARP)

    • Verify that the interface has an IP address and the correct subnet: ip addr show.
    • Check default route: ip route should display default via <gateway>.
    • Test gateway reachability, then inspect the ARP cache:
    ping -c 3 <gateway-ip>
    ip neigh show

    • Address potential ARP issues: a FAILED or INCOMPLETE entry for the gateway suggests a layer-2 problem (cabling, switch, VLAN), while a MAC address that keeps changing for the same IP points to a duplicate IP.
  3. DNS (Name Resolution vs. Connectivity)

    • Distinguish between DNS and connectivity: if ping 8.8.8.8 succeeds but ping example.com fails, the issue is most likely DNS.
    • Test DNS directly:
    dig @8.8.8.8 example.com
    resolvectl query example.com

    • Confirm configuration in /etc/resolv.conf or check systemd-resolved status.
  4. Routing and Path Checks

    • Use ip route get <dest> to see which route, outgoing interface, and source address the kernel would select for that destination.
    • Use traceroute or mtr to analyze where the path fails:
    ip route get 8.8.8.8
    traceroute 8.8.8.8
    
  5. Transport-Level Checks (Services and Ports)

    • Verify if the service is listening: ss -tuln.
    • Test port reachability from a client:
    nc -vz example.com 22
    
    • Inspect service logs (e.g., SSH: journalctl -u sshd -b; on Debian/Ubuntu the unit is named ssh).
  6. Firewall & Security Blocks

    • Examine firewall rules: nft list ruleset or iptables -L or ufw status.
    • Temporarily disable the firewall if deemed safe for testing:
    sudo ufw disable   # Test only, re-enable afterward
    
    • Consider security frameworks (AppArmor/SELinux) that may restrict networking.
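
The bottom-up order of these checks can be sketched as a small decision helper that reports the lowest layer that failed. This is an illustration under stated assumptions, not a complete diagnostic tool: the function names first_failure and result, the ok/fail convention, and the interface, gateway, and host names in the commented example are all placeholders.

```bash
#!/bin/sh
# first_failure: given check results in bottom-up order, print the lowest
# layer that failed. Each argument is "ok" or "fail". The function name and
# the ok/fail convention are illustrative.
first_failure() {
    for layer in link ip gateway internet dns service; do
        [ "$#" -gt 0 ] || break
        if [ "$1" != ok ]; then
            echo "$layer"
            return 1
        fi
        shift
    done
    echo none
}

# result CMD...: run a check quietly and print "ok" or "fail".
result() {
    if "$@" >/dev/null 2>&1; then echo ok; else echo fail; fi
}

# Example wiring (enp3s0, 192.168.1.1, and example.com are placeholders):
# first_failure \
#     "$(result grep -q 1 /sys/class/net/enp3s0/carrier)" \
#     "$(result sh -c 'ip -4 addr show dev enp3s0 | grep -q "inet "')" \
#     "$(result ping -c 1 -W 2 192.168.1.1)" \
#     "$(result ping -c 1 -W 2 8.8.8.8)" \
#     "$(result getent hosts example.com)" \
#     "$(result nc -z -w 3 example.com 22)"
```

Stopping at the first failure matters because a broken lower layer makes every check above it fail as well; fixing the reported layer and re-running is usually more productive than chasing the later failures.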

Packet Capture and Analysis

If checks don’t pinpoint the issue, capture traffic using tcpdump:

  • Basic Commands:
    # Capture traffic between local host and 1.2.3.4 on interface eth0, limit size and save
    sudo tcpdump -i eth0 host 1.2.3.4 -s 96 -w /tmp/capture.pcap
    # View a text summary
    sudo tcpdump -r /tmp/capture.pcap -nn -tt

  • Filters can minimize disk usage: capture only necessary hosts or protocols. Use -s to limit the snapshot length.
  • Import captures into Wireshark for detailed analysis. Refer to Wireshark docs for guidance.

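Common patterns to look for in a capture:

  • Repeated SYNs without SYN-ACK: The remote host is down, or a firewall or routing problem is silently dropping packets in one direction.
  • RST from remote: The connection is being actively rejected, typically because nothing is listening on that port or a firewall is configured to reject.
  • DNS responses with NXDOMAIN or delays: NXDOMAIN means the queried name does not exist (check for typos or a wrong search domain); slow or missing responses point to DNS server or reachability issues.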

Interpreting captures requires practice; start by focusing on SYNs, SYN-ACKs, RSTs, and ICMP messages.
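
As a starting point for reading captures, the sketch below counts SYN, SYN-ACK, and RST segments in tcpdump's text output. It relies on tcpdump's flag notation, where [S] is a SYN, [S.] is a SYN-ACK, and [R] or [R.] is a reset. The helper name summarize_flags is hypothetical.

```bash
#!/bin/sh
# summarize_flags: read tcpdump text output on stdin and count SYN, SYN-ACK,
# and RST segments. Relies on tcpdump's "Flags [..]" notation:
# [S] = SYN, [S.] = SYN-ACK, [R] or [R.] = RST. Hypothetical helper name.
summarize_flags() {
    awk '
        /Flags \[S\],/    { syn++ }
        /Flags \[S\.\],/  { synack++ }
        /Flags \[R\.?\],/ { rst++ }
        END { printf "syn=%d synack=%d rst=%d\n", syn, synack, rst }
    '
}
```

It can be fed from a saved capture with sudo tcpdump -r /tmp/capture.pcap -nn | summarize_flags. Several SYNs with no SYN-ACK match the first pattern above; a non-zero RST count matches the second.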


Logs and System Services

Logs can often reveal the underlying problem. Key commands include:

# NetworkManager logs this boot
journalctl -u NetworkManager -b
# Kernel and driver messages
journalctl -k | tail -n 200
# General syslog (Debian/Ubuntu)
tail -f /var/log/syslog
# For RedHat/CentOS, check /var/log/messages

Look for DHCP failures, repeated link flaps, driver errors, and messages correlating with the observed outages.
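
Filtering the logs down to network-related lines makes these patterns easier to spot. The sketch below is a rough filter; the keyword list is an assumption, since driver and DHCP client messages differ between systems, and the helper names are hypothetical.

```bash
#!/bin/sh
# network_events: keep only log lines that mention link state changes or DHCP.
# The keyword list is an assumption; driver and DHCP client messages differ
# between systems, so adjust the pattern to what your logs actually contain.
network_events() {
    grep -iE 'link is (up|down)|carrier|dhcp'
}

# count_link_downs: count "link is down" lines on stdin (case-insensitive).
count_link_downs() {
    grep -icE 'link is down'
}
```

For example, journalctl -k -b | network_events lists candidate lines from the current boot, and journalctl -k -b | count_link_downs gives a quick number to compare against the times the outage was observed.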


Common Scenarios and Fixes

Frequent problems and practical solutions include:

  1. No Network on Machine (No IP)
    • Symptoms: No IP assigned.
    • Checks & Fixes:
      • Ensure the interface is UP: ip link show.
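      • For DHCP, check the service that manages the interface (for example, systemctl status NetworkManager or systemctl status systemd-networkd) and renew the lease if necessary. On systems that use dhclient: sudo dhclient -v <iface>.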
      • Assign a temporary static IP:
      sudo ip addr add 192.168.1.50/24 dev enp3s0
      sudo ip route add default via 192.168.1.1
      
  2. No Internet but LAN Works
    • Symptoms: Can connect to local devices but not external hosts.
    • Fixes:
      • Verify the default route: ip route.
      • Ping the gateway. If that succeeds, ping an external IP (8.8.8.8); if the external ping fails, check the router's upstream link and NAT.
      • If you’re running a Linux router, check NAT rules: sudo nft list ruleset.
  3. DNS Resolving Failures
    • Symptoms: IP pings succeed, but hostnames do not.
    • Fixes:
      • Query a public DNS server: dig @8.8.8.8 example.com.
      • If the public query works, restart your resolver: sudo systemctl restart systemd-resolved and clear caches with: resolvectl flush-caches.
  4. Unable to Reach Remote Server (SSH/HTTP)
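    • Symptoms: Connection times out (packets are being dropped, often by a firewall or a routing problem) or is refused (the host is reachable, but nothing is listening on the port or a firewall rejects the connection).
    • Fixes:
      • Verify the service is listening on the server: ss -tuln | grep :22.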
      • Test connectivity from another host within the same network.
      • Check firewall rules on both the server and any intermediate devices.
  5. Intermittent Connectivity and Packet Loss
    • Symptoms: Random drops or high latency.
    • Fixes:
      • Inspect NIC errors: ip -s link for RX/TX errors.
      • Identify duplex mismatches using ethtool.
      • Check Wi-Fi signal strength and interference.
      • Use mtr to pinpoint loss across hops.
  6. Slow Network Performance
    • Causes: Saturated link, mismatched MTU, packet loss.
    • Tests:
      • Measure throughput with iperf3: iperf3 -s on the server and iperf3 -c server on the client.
      • Identify high retransmissions with tcpdump.
      • Verify MTU settings with ip link show or adjust using the command below (a runtime change that does not persist across reboots):
      sudo ip link set dev eth0 mtu 1400
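
To test whether a given MTU works along a path, you can send pings that are not allowed to fragment. The sketch below only does the size arithmetic: an IPv4 header without options is 20 bytes and an ICMP echo header is 8 bytes, so the payload is the MTU minus 28. The helper name is hypothetical.

```bash
#!/bin/sh
# icmp_payload_for_mtu: largest ICMP echo payload that fits in one IPv4
# packet of the given MTU: MTU minus 20 bytes of IPv4 header (no options)
# minus 8 bytes of ICMP header. Hypothetical helper name.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}
```

With iputils ping, a probe for a 1500-byte MTU would be ping -c 3 -M do -s "$(icmp_payload_for_mtu 1500)" 8.8.8.8, where -M do forbids fragmentation. If that fails while a smaller size succeeds, something along the path has a lower MTU.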
      

Advanced Topics

  • Containers and Namespaces: Containers add virtual interfaces and networks. Use ip netns, docker network ls, and bridge commands for inspection.
  • VPNs / Overlays: Verify tunnel endpoints and routing policies.
  • WSL/VM Networking Quirks: If troubleshooting WSL, consult the WSL configuration guide.
  • When to Escalate: If issues trace beyond your network, gather evidence and contact your provider.
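
When escalating, it helps to hand over the evidence as a single archive. A minimal sketch, assuming the saved outputs (for example mtr reports and traceroute results) already sit in one directory; the function name and naming scheme are illustrative.

```bash
#!/bin/sh
# bundle_evidence: pack a directory of saved outputs into a gzip-compressed
# tarball and print the tarball's path. Names and layout are illustrative.
bundle_evidence() {
    src=$1
    out=${2:-$src.tar.gz}
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$out"
}
```

For example, bundle_evidence /tmp/case-outage creates /tmp/case-outage.tar.gz. Review the contents before sending, since captures and logs can contain internal addresses and hostnames.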

Also consider dependencies like LDAP for service access by consulting our LDAP integration guide.


Preventive Measures & Best Practices

  • Monitoring & Alerting: Implement uptime checks and service latency monitors.
  • Change Management: Document changes and use version control for config files.
  • Backups: Maintain backups of network configurations.
  • Security Hygiene: Enforce least privilege and ensure centralized logging.
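
A basic latency monitor can be built from ping alone. The sketch below parses the average round-trip time from the summary line and compares it with a threshold. It assumes the iputils summary format (rtt min/avg/max/mdev = ...); the helper names and the 100 ms threshold mentioned afterwards are illustrative.

```bash
#!/bin/sh
# avg_rtt_ms: read ping output on stdin and print the average round-trip time
# from the summary line, e.g.
#   rtt min/avg/max/mdev = 0.045/0.052/0.061/0.006 ms
# Assumes the iputils summary format shown above.
avg_rtt_ms() {
    awk -F'/' '/^rtt / { print $5 }'
}

# rtt_exceeds AVG_MS LIMIT_MS: succeed (exit 0) when AVG_MS is above LIMIT_MS.
rtt_exceeds() {
    awk -v avg="$1" -v limit="$2" 'BEGIN { exit !(avg + 0 > limit + 0) }'
}
```

A cron job or systemd timer could run avg=$(ping -c 5 -q example.com | avg_rtt_ms) and then rtt_exceeds "$avg" 100 && echo "latency alert: ${avg} ms". An unreachable host produces no summary line, so check ping's exit status separately for outright failures.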

Troubleshooting Checklist (Quick Reference)

Keep this checklist handy for rapid troubleshooting:

  1. Reproduce & record symptoms (timestamps, screenshots).
  2. Check physical/link status: ip link, ethtool.
  3. Verify IP & netmask: ip addr.
  4. Confirm default route & gateway reachability: ip route, ping <gateway>.
  5. Test external connectivity: ping 8.8.8.8.
  6. Test DNS functionality: dig @8.8.8.8 example.com.
  7. Check service ports: ss -tuln, nc -vz.
  8. Inspect firewall rules: nft list ruleset or iptables -L.
  9. Capture packets as needed: tcpdump -i <iface> host <ip> -w capture.pcap.
  10. Review logs: journalctl -u NetworkManager -b, journalctl -k.
  11. Document fixes and update runbooks.

(Consider converting this checklist into a printable PDF for operational runbooks.)


Conclusion

Network troubleshooting is a vital skill developed through practice and documentation. Start in a lab environment—set up VMs or a home lab to simulate issues and fixes. For further practice, simulate DNS failures or intentionally cause DHCP outages to enhance your skills.

For assistance with documenting incidents or creating post-mortem analyses, refer to our guide on creating technical presentations.


TBO Editorial
