nvme_qpair.c - OpenGrok history log for /freebsd/sys/dev/nvme/nvme

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
# 3d89acf5	15-Aug-2024	Warner Losh <imp@FreeBSD.org>	nvme: Separate total failures from I/O failures When it's a I/O failure, we can still send admin commands. Separate out the admin failures and flag them as such so that we can still send admin comma nvme: Separate total failures from I/O failures When it's a I/O failure, we can still send admin commands. Separate out the admin failures and flag them as such so that we can still send admin commands on half-failed drives. Fixes: 9229b3105d88 (nvme: Fail passthrough commands right away in failed state) Sponsored by: Netflix show more ...
# ce7fac64	16-Aug-2024	Warner Losh <imp@FreeBSD.org>	Revert "nvme: Separate total failures from I/O failures" All kinds of crazy stuff was mixed into this commit. Revert it and do it again. This reverts commit d5507f9e436698ac17dc5ace7ef58493988a9b04 Revert "nvme: Separate total failures from I/O failures" All kinds of crazy stuff was mixed into this commit. Revert it and do it again. This reverts commit d5507f9e436698ac17dc5ace7ef58493988a9b04. Sponsored by: Netflix show more ...
# d5507f9e	15-Aug-2024	Warner Losh <imp@FreeBSD.org>	nvme: Separate total failures from I/O failures When it's a I/O failure, we can still send admin commands. Separate out the admin failures and flag them as such so that we can still send admin comma nvme: Separate total failures from I/O failures When it's a I/O failure, we can still send admin commands. Separate out the admin failures and flag them as such so that we can still send admin commands on half-failed drives. Fixes: 9229b3105d88 (nvme: Fail passthrough commands right away in failed state) Sponsored by: Netflix show more ...
# bb7f7d5b	24-Jul-2024	Warner Losh <imp@FreeBSD.org>	nvme: Warn if there's system interrupt issues. Issue a warning if we have system interrupt issues. If you get this warning, then we submitted a request, it timed out without an interrupt being poste nvme: Warn if there's system interrupt issues. Issue a warning if we have system interrupt issues. If you get this warning, then we submitted a request, it timed out without an interrupt being posted, but when we polled the card's completion, we found completion events. This indicates that we're missing interrupts, and to date all the times I've helped people track issues like this down it has been a system issue, not an NVMe driver isseue. Sponsored by: Netflix Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D46031 show more ...
# aa413543	24-Jul-2024	Warner Losh <imp@FreeBSD.org>	nvme: Optimize timeout code further Optimize timeout code based on three observations. (1) The tr queues are sorted in order of submission, so the first one that could time out is the first "re nvme: Optimize timeout code further Optimize timeout code based on three observations. (1) The tr queues are sorted in order of submission, so the first one that could time out is the first "real" one on the list. (2) Timeouts for a given queue are all the same length (well, except at startup, where timeout doesn't matter, and when you change it at runtime, where timeouts will still happen eventually and the difference isn't worth optimizing for). (3) Calling the ISR races the real ISR and we should avoid that better. So now, after checking to see if the card is there and working, the timeout routine scans the pending tracker list until it finds a non-AER tracker. If the deadline hasn't passed, we return, doing nothing further. Otherwise, we call poll completions and then process the list looking for timed out items. This should move the timeout routine to touching hardware only when it's really necessary. It thus avoids racing the normal ISR, while still timig out stuck transactions quickly enough. There was also some minor code motion to make all of the above flow more nicely for the reader. When interrupts aren't working at all, then this will increase latency somewhat. But when interrupts aren't working at all, there's bigger problems and we should poll quite often in that case. That will be handled in future commits. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D46026 show more ...
# e6d3ba4b	24-Jul-2024	Warner Losh <imp@FreeBSD.org>	nvme: Lock when processing an abort completion command. When processing an abort completion command, we have to lock. But we have to lock the qpair of the original transaction (not the abort we're c nvme: Lock when processing an abort completion command. When processing an abort completion command, we have to lock. But we have to lock the qpair of the original transaction (not the abort we're completing). We do this to avoid races with checking the completion id to tr mapping array, as well as to manually complete it. Note: we don't handle the completion status of 'Asked to abort too many transactions at once.' That will be fixed on subsequent commits. Add a note to that effect for now since it's a harder problem to solve. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D46025 show more ...
# 86909f7a	24-Jul-2024	Warner Losh <imp@FreeBSD.org>	nvme: Always lock and only avoid processing for recovery state When we lose a race with the timeout code, shift towards waiting for that timeout code to complete so we can acquire the lock. This way nvme: Always lock and only avoid processing for recovery state When we lose a race with the timeout code, shift towards waiting for that timeout code to complete so we can acquire the lock. This way we can make sure we're in 'normal' mode before processing I/O completions. If we're not in 'normal' mode, then we're resetting and we should avoid completions. Sponsored by: Netflix Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D46024 show more ...
# 123e2906	20-Jul-2024	Warner Losh <imp@FreeBSD.org>	nvme: widen nvme_qpair_manual_complete_request for better errors Make nvme_qpair_manual_complete_request take dnr as well as a print_on_error action. Make the status word computation common between nvme: widen nvme_qpair_manual_complete_request for better errors Make nvme_qpair_manual_complete_request take dnr as well as a print_on_error action. Make the status word computation common between it and nvme_qpair_manual_complete_tracker. And print the error when we are cancelling the I/O on failure, but not when we're filtering the I/O after we've failed. Make it private again to nvme_qpair.c. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D46049 show more ...
# 9229b310	20-Jul-2024	Warner Losh <imp@FreeBSD.org>	nvme: Fail passthrough commands right away in failed state. When the drive is failed, we can't send passthrough commands to the card, so fail them right away. Rearrange the comments to reflect the c nvme: Fail passthrough commands right away in failed state. When the drive is failed, we can't send passthrough commands to the card, so fail them right away. Rearrange the comments to reflect the current failure paths in the driver. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D46048 show more ...
Revision tags: release/14.1.0
# 0dd84c3b	13-May-2024	Warner Losh <imp@FreeBSD.org>	nvme: Add comment about where tr->deadline is set It's easy to overlook the chain of events that lead to tr->deadline being updated. Add a comment here to explain what otherwise looks like an oversi nvme: Add comment about where tr->deadline is set It's easy to overlook the chain of events that lead to tr->deadline being updated. Add a comment here to explain what otherwise looks like an oversight w/o careful study. Sponsored by: Netflix show more ...
# c931cf6a	13-May-2024	Warner Losh <imp@FreeBSD.org>	nvme: Slight simplification We don't need to dereference qpair to get the ctrlr pointer each time, so use the cached value. It's not going to change. No change intended. Sponsored by: Netflix
# 9db8ca92	13-May-2024	Warner Losh <imp@FreeBSD.org>	nvme: Slight reworking this loop to match FreeBSD style Update the comment for the code, and slightly rework the code in the 'fast exit' paradigm that FreeBSD generally tries to do. Sponsored by: nvme: Slight reworking this loop to match FreeBSD style Update the comment for the code, and slightly rework the code in the 'fast exit' paradigm that FreeBSD generally tries to do. Sponsored by: Netflix show more ...
# 5a178b83	13-May-2024	Warner Losh <imp@FreeBSD.org>	nvme: Add locking asserts nvme_qpair_complete_tracker and nvme_qpair_manual_complete_tracker have to be called without the qpair lock, so assert its unowned. Sponsored by: Netflix
Revision tags: release/13.3.0
# 5650bd3f	29-Jan-2024	John Baldwin <jhb@FreeBSD.org>	nvme: Use the NVMEF macro to construct fields Reviewed by: chuck, imp Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43605
# 479680f2	29-Jan-2024	John Baldwin <jhb@FreeBSD.org>	nvme: Use the NVMEV macro instead of expanded versions Reviewed by: chuck Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D43595
# fdafd315	24-Nov-2023	Warner Losh <imp@FreeBSD.org>	sys: Automated cleanup of cdefs and other formatting Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remov sys: Automated cleanup of cdefs and other formatting Apply the following automated changes to try to eliminate no-longer-needed sys/cdefs.h includes as well as now-empty blank lines in a row. Remove /^#if.\n#endif.\n#include\s+<sys/cdefs.h>.\n/ Remove /\n+#include\s+<sys/cdefs.h>.\n+#if.\n#endif.\n+/ Remove /\n+#if.\n#endif.\n+/ Remove /^#if.\n#endif.\n/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/ Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/ Sponsored by: Netflix show more ...
Revision tags: release/14.0.0
# 8d6c0743	06-Nov-2023	Alexander Motin <mav@FreeBSD.org>	nvme: Introduce longer timeouts for admin queue KIOXIA CD8 SSDs routinely take ~25 seconds to delete non-empty namespace. In some cases like hot-plug it takes longer, triggering timeout and control nvme: Introduce longer timeouts for admin queue KIOXIA CD8 SSDs routinely take ~25 seconds to delete non-empty namespace. In some cases like hot-plug it takes longer, triggering timeout and controller resets after just 30 seconds. Linux for many years has separate 60 seconds timeout for admin queue. This patch does the same. And it is good to be consistent. Sponsored by: iXsystems, Inc. Reviewed by: imp MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D42454 show more ...
# afc3d49b	10-Oct-2023	Warner Losh <imp@FreeBSD.org>	nvme: Close a race in destroying qpair and timeouts While we should have cleared all the pending I/O prior to calling nvme_qpair_destroy, which should ensure that if the callout_drain causes a call nvme: Close a race in destroying qpair and timeouts While we should have cleared all the pending I/O prior to calling nvme_qpair_destroy, which should ensure that if the callout_drain causes a call to nvme_qpair_timeout(), it won't schedule any new timeout. However, it doesn't hurt to set timeout_pending to false in nvme_qpair_destroy() and have nvme_qpair_timeout() exit early if it sees it w/o scheduling a timeout. Since we don't otherwise stop the timeout until we're about to destroy the qpair, this ensures we fail safe. The lock/unlock also ensures the callout_drain will either remove the callout, or wait for it to run with the early bailout. We can likely further improve this by using callout_stop() inside the pending lock. I'll investigate that for future refinement. Sponsored by: Netflix Suggestions by: jhb Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D42065 show more ...
# 9cd7b624	10-Oct-2023	Warner Losh <imp@FreeBSD.org>	nvme: Eliminate RECOVERY_FAILED state While it seemed like a good idea to have this state, we can do everything we wanted with the state by checking ctrlr->is_failed since that's set before we start nvme: Eliminate RECOVERY_FAILED state While it seemed like a good idea to have this state, we can do everything we wanted with the state by checking ctrlr->is_failed since that's set before we start failing the qpairs. Add some comments about racing when we're failing the controller, though in practice I'm not sure that kind of race could even be lost. Sponsored by: Netflix Reviewed by: chuck, gallatin, jhb Differential Revision: https://reviews.freebsd.org/D42051 show more ...
# 1d6021cd	26-Sep-2023	Warner Losh <imp@FreeBSD.org>	nvme: Supress noise messages When we're suspending, we get messages about waiting for the controller to reset. These are in error: we're not waiting for it to reset. We put the recovery state as par nvme: Supress noise messages When we're suspending, we get messages about waiting for the controller to reset. These are in error: we're not waiting for it to reset. We put the recovery state as part of suspending, so we should suppress these as a false positive. Also remove a stray debug that's left over from earlier versions of the recovery code that no longer makes sense. Sponsored by: Netflix show more ...
# da8324a9	24-Sep-2023	Warner Losh <imp@FreeBSD.org>	nvme: Fix locking protocol violation to fix suspend / resume Currently, when we suspend, we need to tear down all the qpairs. We call nvme_admin_qpair_abort_aers with the admin qpair lock held, but nvme: Fix locking protocol violation to fix suspend / resume Currently, when we suspend, we need to tear down all the qpairs. We call nvme_admin_qpair_abort_aers with the admin qpair lock held, but the tracker it will call for the pending AER also locks it (recursively) hitting an assert. This routine is called without the qpair lock held when we destroy the device entirely in a number of places. Add an assert to this effect and drop the qpair lock before calling it. nvme_admin_qpair_abort_aers then locks the qpair lock to traverse the list, dropping it around calls to nvme_qpair_complete_tracker, and restarting the list scan after picking it back up. Note: If interrupts are still running, there's a tiny window for these AERs: If one fires just an instant after we manually complete it, then we'll be fine: we set the state of the queue to 'waiting' and we ignore interrupts while 'waiting'. We know we'll destroy all the queue state with these pending interrupts before looking at them again and we know all the TRs will have been completed or rescheduled. So either way we're covered. Also, tidy up the failure case as well: failing a queue is a superset of disabling it, so no need to call disable first. This solves solves some locking issues with recursion since we don't need to recurse.. Set the qpair state of failed queues to RECOVERY_FAILED and stop scheduling the watchdog. Assert we're not failed when we're enabling a qpair, since failure currently is one-way. Make failure a little less verbose. Next, kill the pre/post reset stuff. It's completely bogus since we disable the qparis, we don't need to also hold the lock through the reset: disabling will cause the ISR to return early. This keeps us from recursing on the recovery lock when resuming. We only need the recovery lock to avoid a specific race between the timer and the ISR. Finally, kill NVME_RESET_2X. It'S been a major release since we put it in and nobody has used it as far as I can tell. And it was a motivator for the pre/post uglification. These are all interrelated, so need to be done at the same time. Sponsored by: Netflix Reviewed by: jhb Tested by: jhb (made sure suspend / resume worked) MFC After: 3 days Differential Revision: https://reviews.freebsd.org/D41866 show more ...
# d9543162	15-Sep-2023	Warner Losh <imp@FreeBSD.org>	nvme: Give up when we've failed Normally, we poll the device every so often to see if commands have timed out. However, we'll go into the recovery state as part of failing the drive. To account for nvme: Give up when we've failed Normally, we poll the device every so often to see if commands have timed out. However, we'll go into the recovery state as part of failing the drive. To account for all possibilties, if we're failed when we get into the polling function, just stop polling: Party is over. Sponsored by: Netflix show more ...
# 8052b01e	25-Aug-2023	Warner Losh <imp@FreeBSD.org>	nvme: Add exclusion for ISR Add a basically uncontended spinlock that we take out while the ISR is running. This has two effects: First, when we get a timeout, we can safely call the nvme_qpair_proc nvme: Add exclusion for ISR Add a basically uncontended spinlock that we take out while the ISR is running. This has two effects: First, when we get a timeout, we can safely call the nvme_qpair_process_completions w/o racing any ISRs. Second, we can use it to ensure that we don't reset the card while the ISRs are active (right now we just sleep and hope for the best, which usually is fine, but not always). Sponsored by: Netflix MFC After: 2 weeks Reviewed by: chuck, gallatin Differential Revision: https://reviews.freebsd.org/D41452 show more ...
# d4959bfc	25-Aug-2023	Warner Losh <imp@FreeBSD.org>	nvme: Greatly improve error recovery Next phase of error recovery: Eliminate the REOVERY_START phase, since we don't need to wait to start recovery. Eliminate the RECOVERY_RESET phase since it is tr nvme: Greatly improve error recovery Next phase of error recovery: Eliminate the REOVERY_START phase, since we don't need to wait to start recovery. Eliminate the RECOVERY_RESET phase since it is transient, we now transition from RECOVERY_NORMAL into RECOVERY_WAITING. In normal mode, read the status of the controller. If it is in failed state, or appears to be hot-plugged, jump directly to reset which will sort out the proper things to do. This will cause all pending I/O to complete with an abort status before the reset. When in the NORMAL state, call the interrupt handler. This will complete all pending transactions when interrupts are broken or temporarily misbehaving. We then check all the pending completions for timeouts. If we have abort enabled, then we'll send an abort. Otherwise we'll assume the controller is wedged and needs a reset. By calling the interrupt handler here, we'll avoid an issue with the current code where we transitioned to RECOVERY_START which prevented any completions from happening. Now completions happen. In addition and follow-on I/O that is scheduled in the completion routines will be submitted, rather than queued, because the recovery state is correct. This also fixes a problem where I/O would timeout, but never complete, leading to hung I/O. Resetting remains the same as before, just when we chose to reset has changed. A nice side effect of these changes is that we now do I/O when interrupts to the card are totally broken. Followon commits will improve the error reporting and logging when this happens. Performance will be aweful, but will at least be minimally functional. There is a small race when we're checking the completions if interrupts are working, but this is handled in a future commit. Sponsored by: Netflix MFC After: 2 weeks Differential Revision: https://reviews.freebsd.org/D36922 show more ...
# 2a6b7055	25-Aug-2023	Warner Losh <imp@FreeBSD.org>	nvme: Timeout expired transactions When we went to having a shared timeout routine, failing the timed-out transaction code was inadvertantly dropped. Reinstate it. Fixes: 502dc84a8b670 Sponsored nvme: Timeout expired transactions When we went to having a shared timeout routine, failing the timed-out transaction code was inadvertantly dropped. Reinstate it. Fixes: 502dc84a8b670 Sponsored by: Netflix MFC After: 2 weeks Reviewed by: chuck, jhb Differential Revision: https://reviews.freebsd.org/D36921 show more ...
12 3 4 5 6 7 8