
I am getting the warning "1 daemons have recently crashed" in the Ceph health status.

The full message is "1 daemons have recently crashed osd.9 crashed on host prox-node4".

How do I recover from it?

2 Answers

0 votes
by

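You can use the command ceph crash ls to list the crash reports and find the one for the crashed OSD (ceph crash ls-new shows only the reports that have not been archived yet).

Then use ceph crash archive <id> to acknowledge that one report, or ceph crash archive-all to acknowledge all of them. Once the report is archived, the warning disappears from the health status.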
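
Put together, the steps might look like the sketch below. The function name review_and_archive is made up for this example, and the crash ID you pass in has to be one printed by ceph crash ls-new on your own cluster. It assumes the ceph CLI is on the PATH and that you run it from a node with an admin keyring.

```shell
#!/bin/sh
# Sketch only: review one crash report, then acknowledge it.
# review_and_archive is a hypothetical helper name, not part of Ceph.
# $1 must be a crash ID taken from the output of `ceph crash ls-new`.

review_and_archive() {
    # Show which daemon triggered the "recently crashed" warning.
    ceph health detail
    # List the crash reports that have not been archived yet.
    ceph crash ls-new
    # Print the details of this report (daemon, version, backtrace).
    ceph crash info "$1"
    # Mark this report as archived so it stops raising the warning.
    ceph crash archive "$1"
}
```

You would call it as review_and_archive <id>. It is worth reading the ceph crash info output and checking that osd.9 is back up before archiving, since archiving only acknowledges the report and does not repair anything.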

0 votes
by
The "ceph crash archive-all" command marks every new crash report in the cluster as archived, that is, acknowledged. It does not build a tar archive and it does not delete anything.

A crash report contains information about a crashed daemon, such as the daemon name, the Ceph version, the time of the crash, and a stack trace, that can help diagnose the cause of the crash.

By default the report is written under the "/var/lib/ceph/crash" directory on the host where the daemon crashed and is then posted to the cluster. Archiving leaves the report where it is: it still shows up in "ceph crash ls", but it no longer shows up in "ceph crash ls-new" and no longer counts toward the "daemons have recently crashed" health warning.

Because archiving only silences the warning, look at the report with "ceph crash info <id>" before you archive it. You can use that output to diagnose the cause of the crash or share it with the Ceph community or a support team for assistance.

You should also have a backup strategy in place to limit data loss or downtime caused by daemon crashes or other issues. With Ceph in particular, make sure replication and recovery are configured properly so that your data stays intact if a daemon crashes.
...