1 - Quick triage (1 minute)
curl -sS -I http://localhost:8080/server/api
sudo ss -ltnp | grep -E ':8983[[:space:]]'
sudo systemctl status tomcat10 --no-pager
sudo tail -n 40 /var/log/apache2/error.log
2 - Full recovery procedure
A. Stop leftover Solr instances
sudo -u solr /opt/solr/bin/solr stop -all
B. Start Solr
sudo systemctl start solr
C. Restart Tomcat10
sudo systemctl restart tomcat10
Expanded Recovery & Runbook: DSpace (Apache -> Tomcat10 -> Solr -> Postgres)
Below is a compact, copy-pasteable runbook to follow the next time the site shows 500s or the Apache log reports "error reading status line from remote server localhost:8080". It covers a one-minute triage, the full step-by-step recovery, diagnostics to capture, common failure patterns, a safe restart script you can install, and forward-looking prevention steps.
All commands assume you have sudo access (run as root or prefix with sudo).
1 - Quick triage (1 minute)
Run these four commands and share the failures if anything looks wrong:
# 1) Does the frontend/API respond?
curl -sS -I http://localhost:8080/server/api || echo "no response on :8080"
# 2) Is Solr listening?
sudo ss -ltnp | grep -E ':8983[[:space:]]' || sudo lsof -i :8983
# 3) Tomcat service status
sudo systemctl status tomcat10 --no-pager
# 4) Recent Apache error messages
sudo tail -n 40 /var/log/apache2/error.log
If curl fails, Tomcat is inactive, or Solr is not listening, follow the full recovery below.
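The port check is easy to get wrong with a plain grep: a pattern that also matches the word LISTEN succeeds whenever anything at all is listening. Below is a minimal sketch of a stricter check. It assumes the usual ss -ltn column layout (state in column 1, local address:port in column 4), and the function name port_is_listening is hypothetical.

```shell
# Hypothetical helper: succeeds only if the given TCP port shows up as a
# local listening address in `ss -ltn` output supplied on stdin.
port_is_listening() {
  local port="$1"
  # Column 4 is "Local Address:Port"; anchor the port at the end of that
  # field so that 8983 does not match 18983 or a peer address.
  awk -v p=":${port}" '
    $1 == "LISTEN" && $4 ~ (p "$") { found = 1 }
    END { exit !found }
  '
}

# Example (assumed usage):
#   ss -ltn | port_is_listening 8983 || echo "Solr is not listening on 8983"
```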
2 - Full step-by-step recovery (safe order)
High-level rule: start services in this order: Solr -> Tomcat -> Apache. Apache may stay running, but the backend needs to be healthy before Apache proxies to it.
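The ordering rule boils down to one idea: do not start the next service until the previous one answers. Below is a minimal sketch of that gate. The retry helper is hypothetical (it is not part of DSpace, Solr, or systemd); it reruns a check until it succeeds or the attempts run out.

```shell
# Hypothetical helper: retry <attempts> <delay-seconds> <command...>
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for (( i = 1; i <= attempts; i++ )); do
    "$@" && return 0
    # Do not sleep after the final failed attempt.
    (( i < attempts )) && sleep "$delay"
  done
  return 1
}

# Example (assumed usage): restart Tomcat only once Solr answers.
#   retry 12 5 curl -sSf -o /dev/null http://localhost:8983/solr/ \
#     && sudo systemctl restart tomcat10
```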
A. Stop risky/manual leftovers
Avoid mixing methods (systemctl plus a manual /opt/solr/bin/solr start). Stop stray manual instances first.
# Stop any manual Solr run (safe, by solr user)
sudo -u solr /opt/solr/bin/solr stop -all 2>/dev/null || true
# Ensure no java/solr processes are left for port 8983
sudo lsof -i :8983 || ss -ltnp | grep 8983 || true
B. Start Solr (use systemd if available)
Prefer systemctl for reliability:
sudo systemctl start solr
sudo systemctl status solr --no-pager
# Wait & check health
for i in {1..12}; do curl -sSf -o /dev/null http://localhost:8983/solr/ && break || sleep 5; done
If systemctl start solr fails with port 8983 already in use, run:
sudo lsof -i :8983 -Pn
# note pid -> if it's a stale solr process owned by solr, stop it:
sudo -u solr /opt/solr/bin/solr stop -all
# If an unrelated process is on 8983, investigate that PID and free the port
ps -fp <pid>
C. Start Tomcat 10 (only after Solr healthy)
sudo systemctl restart tomcat10
sudo systemctl status tomcat10 --no-pager
# tail Tomcat logs while it starts
sudo tail -n 200 /var/log/tomcat10/catalina.out
D. Verify DSpace REST API
Wait for /server/api to answer:
for i in {1..18}; do
    # curl prints 000 itself when it gets no response, so only suppress its exit code
    http_status=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/server/api || true)
    echo "$(date '+%F %T') status=$http_status"
    # stop waiting only on a 2xx or 3xx answer
    [ "$http_status" -ge 200 ] && [ "$http_status" -lt 400 ] && break
    sleep 5
done
Expect 200 with a JSON body. If the loop ends on 500 or 000 (no response), check the Tomcat and DSpace logs next.
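To make the meaning of each status explicit, the outcomes can be mapped to a next step. This is a rough sketch, not DSpace's own tooling: the function name and the labels it prints are hypothetical, and the only fact relied on is that curl reports 000 when it receives no HTTP response.

```shell
# Hypothetical helper: turn the HTTP status from the API check into a triage hint.
classify_api_status() {
  case "$1" in
    2??|3??) echo "ok" ;;             # API answered; nothing to do
    000)     echo "no-response" ;;    # curl got no reply: Tomcat down or still starting
    5??)     echo "backend-error" ;;  # Tomcat answered but the webapp failed: read the logs
    4??)     echo "client-error" ;;   # Tomcat is up; check the URL and the deployed webapp
    *)       echo "unexpected" ;;
  esac
}

# Example (assumed usage):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/server/api || true)
#   echo "API check: $(classify_api_status "$code")"
```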
E. Check Apache proxy (if backend healthy but browser still shows 500)
Validate Apache config and reload:
sudo apachectl -t
sudo systemctl reload apache2
# tail Apache error log
sudo tail -n 80 /var/log/apache2/error.log
3 - Diagnostic commands & log locations (collect these when reporting)
Collect and save these snippets whenever you open an incident; they show the most useful information.
# Processes & ports
ps -ef | grep -E 'tomcat|java|solr'
sudo ss -ltnp | grep -E ':8080|:8983|:80|:443'
# Systemd status
sudo systemctl status solr tomcat10 apache2 --no-pager
# Key logs (tail last 200 lines)
sudo tail -n 200 /dspace/log/dspace.log
sudo tail -n 200 /var/log/tomcat10/catalina.out
sudo journalctl -u solr -n 200 --no-pager
sudo tail -n 200 /var/log/apache2/error.log
# Check disk, memory & OOM
df -h
free -m
sudo dmesg | grep -Ei 'killed process|oom|out of memory' || true
# Search for Exceptions quickly
sudo grep -iR "Exception" /dspace/log /var/log/tomcat10 2>/dev/null | tail -n 50
Save these outputs to a file and attach to a ticket if needed.
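One way to save the outputs is a small wrapper that appends each command's output to a single report file under a labelled header. This is a sketch under assumptions: the capture function and the report path are hypothetical, and a failing command is recorded rather than aborting the collection.

```shell
# Hypothetical helper: capture <report-file> <command...>
# Appends a header plus the command's stdout and stderr to the report.
capture() {
  local report="$1"
  shift
  {
    printf '\n===== %s =====\n' "$*"
    # Record a failure instead of stopping, so later diagnostics still run.
    "$@" 2>&1 || printf '(command exited with status %s)\n' "$?"
  } >> "$report"
}

# Example (assumed usage; the report location is illustrative):
#   report="/tmp/dspace-diag-$(date +%F_%H%M%S).txt"
#   capture "$report" df -h
#   capture "$report" sudo systemctl status solr tomcat10 apache2 --no-pager
#   capture "$report" bash -c "ps -ef | grep -E 'tomcat|java|solr'"
```

Pipelines have to be wrapped in bash -c, as in the last example, because capture runs a single command with its arguments.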
4 - Common failure patterns & fixes
A. Solr fails to start: port already in use
Cause: a manually started Solr process is still running, or another application occupies 8983. Fix:
- Identify the PID (sudo lsof -i :8983)
- If it's Solr started manually, stop it via /opt/solr/bin/solr stop -all
- Use systemctl start solr after cleaning up
B. Tomcat crashes / DSpace logs show startup exceptions
Cause: misconfiguration, missing DB, OOM, or incomplete deployment. Fix:
- Inspect /var/log/tomcat10/catalina.out and /dspace/log/dspace.log for a stack trace
- Common quick checks:
  - DB reachable (psql connectivity from server)
  - Disk full (df -h)
  - JVM OOM (check dmesg)
- Redeploy the war if missing:
ls -l /var/lib/tomcat10/webapps/dspace.war
# if missing, copy war and restart tomcat:
sudo cp /path/to/dspace.war /var/lib/tomcat10/webapps/
sudo systemctl restart tomcat10
C. Apache reverse proxy error: "error reading status line from remote server"
Cause: the backend closed the connection (Tomcat crashed or refused it). Fix:
- Confirm Tomcat is alive and responding on 8080 using curl locally
- Check the Apache proxy settings and ProxyTimeout, then reload Apache
D. Sudden outages after editing submission-forms.xml
If you edited DSpace config files, always restart Solr (if the change affects indexing) and Tomcat in the correct order, then check the logs for parsing errors. If malformed XML causes the app to throw at startup, revert to the backup.
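Reverting is only possible if a backup exists, so it helps to pair every edit with a timestamped copy. This is a minimal sketch with hypothetical function names. The .bak.<timestamp> suffix follows the cp pattern used elsewhere in this runbook, and the config path in the usage comment is an assumption about a typical install.

```shell
# Hypothetical helper: copy a config file to <file>.bak.<timestamp> and print the copy's path.
backup_config() {
  local file="$1" backup
  backup="${file}.bak.$(date +%F_%H%M%S)"
  cp -p -- "$file" "$backup" && printf '%s\n' "$backup"
}

# Hypothetical helper: overwrite <file> with its newest .bak.<timestamp> copy.
restore_latest_backup() {
  local file="$1"
  local backups=( "${file}".bak.* )
  # An unmatched glob stays literal, so verify the first entry really exists.
  [ -e "${backups[0]}" ] || { echo "no backup found for $file" >&2; return 1; }
  # Globs expand in sorted order and these timestamps sort chronologically,
  # so the last element is the newest backup.
  cp -p -- "${backups[${#backups[@]}-1]}" "$file"
}

# Example (assumed usage; the path is an assumption, adjust to your install):
#   backup_config /dspace/config/submission-forms.xml
#   ... edit, restart Tomcat, check logs ...
#   restore_latest_backup /dspace/config/submission-forms.xml   # if startup fails
```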
5 - Recommended safe restart script
Create /usr/local/bin/dspace-safe-restart.sh and make it executable. It performs a safe stop/start with health checks.
#!/bin/bash
# Safe restart for DSpace: Solr first, then Tomcat, then wait for the REST API.
set -euo pipefail
LOG="/var/log/dspace-restart-$(date +%F_%H%M%S).log"
exec > >(tee -a "$LOG") 2>&1
echo "===== dspace-safe-restart started: $(date) ====="
echo "--- 1) Check current statuses ---"
systemctl is-active --quiet solr && echo "solr: active" || echo "solr: inactive"
systemctl is-active --quiet tomcat10 && echo "tomcat10: active" || echo "tomcat10: inactive"
echo "--- 2) Ensure no stray solr instance on 8983 ---"
if sudo lsof -i :8983 -Pn -sTCP:LISTEN >/dev/null 2>&1; then
    echo "Port 8983 in use by:"
    sudo lsof -i :8983 -Pn -sTCP:LISTEN
    echo "Attempting to stop solr via systemctl..."
    sudo systemctl stop solr || true
    sleep 3
    sudo -u solr /opt/solr/bin/solr stop -all || true
fi
echo "--- 3) Start Solr (systemd) ---"
sudo systemctl start solr
# Solr can take a while to bind its port, so poll for up to 60 seconds.
solr_ok=0
for i in {1..12}; do
    if curl -sSf -o /dev/null http://localhost:8983/solr/; then
        solr_ok=1
        break
    fi
    sleep 5
done
if [[ "$solr_ok" -ne 1 ]]; then
    echo "Solr not healthy after start; aborting."
    exit 1
fi
echo "Solr healthy."
echo "--- 4) Restart tomcat10 ---"
sudo systemctl restart tomcat10
sleep 5
echo "--- 5) Wait for DSpace API ---"
api_ok=0
for i in {1..18}; do
    # curl prints 000 itself when it gets no response, so only suppress its exit code.
    status=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/server/api || true)
    echo "$(date '+%F %T') HTTP $status"
    # Accept only 2xx or 3xx responses.
    if [[ "$status" =~ ^[23] ]]; then
        echo "API OK."
        api_ok=1
        break
    fi
    sleep 5
done
if [[ "$api_ok" -ne 1 ]]; then
    echo "DSpace API did not become healthy; check Tomcat and DSpace logs."
    exit 1
fi
echo "===== dspace-safe-restart completed: $(date) ====="
Make executable:
sudo tee /usr/local/bin/dspace-safe-restart.sh >/dev/null <<'EOF'
# (paste script here)
EOF
sudo chmod +x /usr/local/bin/dspace-safe-restart.sh
Run with:
sudo /usr/local/bin/dspace-safe-restart.sh
6 - Forward-looking (prevent recurrence)
- Use only one method to manage Solr: either systemctl or /opt/solr/bin/solr. Prefer systemctl for production.
- Enable systemd auto-restart for solr/tomcat:
# example override
sudo mkdir -p /etc/systemd/system/solr.service.d
sudo tee /etc/systemd/system/solr.service.d/override.conf >/dev/null <<'EOF'
[Service]
Restart=on-failure
RestartSec=5
EOF
sudo systemctl daemon-reload
sudo systemctl enable solr
- Increase Solr ulimits (the warning you saw). Add limits in /etc/security/limits.conf:
solr soft nproc 65000
solr hard nproc 65000
Note that limits.conf applies only to PAM login sessions. A Solr started by systemd takes its limits from the unit, so in that case set LimitNPROC=65000 in a systemd override.
- Enable simple monitoring: a cron job or monitoring tool that calls http://localhost:8080/server/api/ and alerts if the response is non-200.
- Avoid pkill -f solr: it's blunt and can leave locks. Use proper stop commands.
- Daily health check script (optional): curl checks for Solr and the API, run from cron to notify on failure.
- Keep backups of edited config files (submission-forms.xml) before changing them (e.g., cp submission-forms.xml submission-forms.xml.bak.$(date +%F_%T)).
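The monitoring and daily health check items can start as a very small script. Below is a rough sketch, not a full monitoring setup: the function names, the 10-second timeout, and the cron schedule and script path in the comments are all assumptions. It treats anything other than 200 as an alert, as described above.

```shell
# Hypothetical helper: print the HTTP status for a URL.
# curl prints 000 itself when no response arrives, so only its exit code is suppressed.
probe() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1" || true
}

# Hypothetical helper: check every URL given, print one ALERT line per
# failing endpoint, and return non-zero if any endpoint failed.
check_endpoints() {
  local failed=0 url code
  for url in "$@"; do
    code=$(probe "$url")
    if [ "$code" != "200" ]; then
      echo "ALERT: $url returned HTTP ${code:-none}"
      failed=1
    fi
  done
  return "$failed"
}

# Example (assumed usage, saved as a script that ends with this call):
#   check_endpoints http://localhost:8080/server/api/ http://localhost:8983/solr/
# Example crontab entry (hypothetical path and schedule), logging alerts to syslog:
#   */5 * * * * /usr/local/bin/dspace-health-check.sh 2>&1 | logger -t dspace-health
```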
7 - Bash history (how to check & enable timestamps)
To view the current shell history:
history | tail -n 200
To view root user saved history:
sudo tail -n 200 /root/.bash_history
To enable timestamps for history going forward (add to /root/.bashrc and any dspace user shells):
# add these lines to ~/.bashrc
export HISTTIMEFORMAT="%F %T "
export HISTSIZE=10000
export HISTFILESIZE=20000
shopt -s histappend
PROMPT_COMMAND='history -a; history -n;'
This ensures future commands have timestamps and are appended in real time.
8 - Quick Apache proxy snippet (for reference)
Make sure your Apache proxy block for the API is correct:
ProxyPreserveHost On
ProxyRequests Off
ProxyPass /server/api http://localhost:8080/server/api
ProxyPassReverse /server/api http://localhost:8080/server/api
# optional
ProxyTimeout 60
Validate with:
sudo apachectl -t
sudo systemctl reload apache2
TL;DR checklist you can paste on a sticky note
- curl -I http://localhost:8080/server/api
- sudo systemctl status solr tomcat10 --no-pager
- If Solr is down: sudo systemctl start solr, then wait for http://localhost:8983/solr/
- Then sudo systemctl restart tomcat10 and wait for the API.
- If ports are in use: sudo lsof -i :8983 / sudo lsof -i :8080, identify the PID, then stop the correct process.
- Check logs: /dspace/log/dspace.log, /var/log/tomcat10/catalina.out, /var/log/apache2/error.log, journalctl -u solr