History log of /freebsd/sys/dev/nvme/nvme_qpair.c (Results 1 – 25 of 178)
Revision Date Author Comments
# 3d89acf5 15-Aug-2024 Warner Losh <imp@FreeBSD.org>

nvme: Separate total failures from I/O failures

When it's an I/O failure, we can still send admin commands. Separate out
the admin failures and flag them as such so that we can still send admin
commands on half-failed drives.

Fixes: 9229b3105d88 (nvme: Fail passthrough commands right away in failed state)
Sponsored by: Netflix


# ce7fac64 16-Aug-2024 Warner Losh <imp@FreeBSD.org>

Revert "nvme: Separate total failures from I/O failures"

All kinds of crazy stuff was mixed into this commit. Revert
it and do it again.

This reverts commit d5507f9e436698ac17dc5ace7ef58493988a9b04

Revert "nvme: Separate total failures from I/O failures"

All kinds of crazy stuff was mixed into this commit. Revert
it and do it again.

This reverts commit d5507f9e436698ac17dc5ace7ef58493988a9b04.

Sponsored by: Netflix


# d5507f9e 15-Aug-2024 Warner Losh <imp@FreeBSD.org>

nvme: Separate total failures from I/O failures

When it's an I/O failure, we can still send admin commands. Separate out
the admin failures and flag them as such so that we can still send admin
commands on half-failed drives.

Fixes: 9229b3105d88 (nvme: Fail passthrough commands right away in failed state)
Sponsored by: Netflix


# bb7f7d5b 24-Jul-2024 Warner Losh <imp@FreeBSD.org>

nvme: Warn if there are system interrupt issues.

Issue a warning if we have system interrupt issues. If you get this
warning, then we submitted a request, it timed out without an interrupt
being posted, but when we polled the card's completion, we found
completion events. This indicates that we're missing interrupts, and to
date all the times I've helped people track issues like this down it has
been a system issue, not an NVMe driver issue.

Sponsored by: Netflix
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D46031
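
The heuristic can be modeled in a few lines; every name below is invented
for the sketch and is not the driver's actual code.

/*
 * Minimal model of the warning heuristic: a request timed out with no
 * interrupt posted, yet polling the completion queue finds entries, so the
 * likely culprit is interrupt delivery, not the drive.
 */
#include <stdbool.h>
#include <stdio.h>

static int poll_completions(void)       /* stand-in: how many entries reaped */
{
    return 2;
}

static void check_missed_interrupts(bool timed_out, bool interrupt_seen)
{
    if (!timed_out || interrupt_seen)
        return;
    if (poll_completions() > 0)
        printf("warning: found completions by polling after a timeout; "
            "interrupts appear to be getting lost\n");
}

int main(void)
{
    check_missed_interrupts(true, false);
    return 0;
}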


# aa413543 24-Jul-2024 Warner Losh <imp@FreeBSD.org>

nvme: Optimize timeout code further

Optimize timeout code based on three observations.

(1) The tr queues are sorted in order of submission, so the first one
that could time out is the first "real" one on the list.
(2) Timeouts for a given queue are all the same length (well, except
at startup, where timeout doesn't matter, and when you change it
at runtime, where timeouts will still happen eventually and the
difference isn't worth optimizing for).
(3) Calling the interrupt handler from the timeout path races the real
    ISR, and we should avoid that where we can.

So now, after checking to see if the card is there and working, the
timeout routine scans the pending tracker list until it finds a non-AER
tracker. If the deadline hasn't passed, we return, doing nothing
further. Otherwise, we call poll completions and then process the list
looking for timed out items.

This should move the timeout routine to touching hardware only when it's
really necessary. It thus avoids racing the normal ISR, while still
timing out stuck transactions quickly enough.

There was also some minor code motion to make all of the above flow more
nicely for the reader.

When interrupts aren't working at all, this will increase latency
somewhat. But when interrupts aren't working at all, there are bigger
problems and we should poll quite often in that case. That will be
handled in future commits.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D46026
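
A compact userspace model of that scan, with invented types and names
standing in for the driver's tracker queue:

/*
 * Trackers sit in submission order, AERs carry no deadline, and the hardware
 * is only polled once the oldest "real" tracker is actually past its
 * deadline. Types and names are made up for this sketch.
 */
#include <stdbool.h>
#include <stdio.h>

struct tracker {
    bool is_aer;            /* Async Event Requests never time out here */
    long deadline;          /* absolute tick at which this tracker expires */
    struct tracker *next;   /* next tracker, in submission order */
};

static void poll_completions(void)
{
    printf("polling the completion queue\n");
}

static void timeout_scan(struct tracker *head, long now)
{
    struct tracker *t;

    /* Observations (1)+(2): the first non-AER tracker has the oldest deadline. */
    for (t = head; t != NULL && t->is_aer; t = t->next)
        ;
    if (t == NULL || now <= t->deadline)
        return;             /* nothing can have expired yet: do no more work */

    /* Something may be stuck: reap completions first, then fail stragglers. */
    poll_completions();
    for (t = head; t != NULL; t = t->next)
        if (!t->is_aer && now > t->deadline)
            printf("tracker timed out (deadline %ld, now %ld)\n",
                t->deadline, now);
}

int main(void)
{
    struct tracker io = { .is_aer = false, .deadline = 10, .next = NULL };
    struct tracker aer = { .is_aer = true, .deadline = 0, .next = &io };

    timeout_scan(&aer, 5);  /* not expired yet: returns without polling */
    timeout_scan(&aer, 42); /* past the deadline: polls, then reports */
    return 0;
}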


# e6d3ba4b 24-Jul-2024 Warner Losh <imp@FreeBSD.org>

nvme: Lock when processing an abort completion command.

When processing an abort completion command, we have to lock. But we
have to lock the qpair of the original transaction (not the abort we're
completing). We do this to avoid races with checking the completion id
to tr mapping array, as well as to manually complete it.

Note: we don't handle the completion status of 'Asked to abort too many
transactions at once.' That will be fixed in subsequent commits. Add a
note to that effect for now since it's a harder problem to solve.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D46025


# 86909f7a 24-Jul-2024 Warner Losh <imp@FreeBSD.org>

nvme: Always lock and only avoid processing for recovery state

When we lose a race with the timeout code, shift towards waiting for
that timeout code to complete so we can acquire the lock. This way we
can make sure we're in 'normal' mode before processing I/O
completions. If we're not in 'normal' mode, then we're resetting and we
should avoid completions.

Sponsored by: Netflix
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D46024
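
A sketch of the "always take the lock, then only skip the work" pattern,
using a pthread mutex and invented names as stand-ins for the qpair's own
primitives:

#include <pthread.h>

enum recovery_state { RECOVERY_NORMAL, RECOVERY_WAITING };

struct qpair_model {
    pthread_mutex_t lock;               /* models the qpair's recovery lock */
    enum recovery_state state;
};

static void process_completions(struct qpair_model *q) { (void)q; }

static void isr(struct qpair_model *q)
{
    pthread_mutex_lock(&q->lock);       /* wait, rather than trylock-and-bail */
    if (q->state == RECOVERY_NORMAL)    /* resetting? then skip completions */
        process_completions(q);
    pthread_mutex_unlock(&q->lock);
}

int main(void)
{
    struct qpair_model q = { .state = RECOVERY_NORMAL };

    pthread_mutex_init(&q.lock, NULL);
    isr(&q);                            /* normal: completions are processed */
    q.state = RECOVERY_WAITING;
    isr(&q);                            /* resetting: completions are skipped */
    pthread_mutex_destroy(&q.lock);
    return 0;
}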


# 123e2906 20-Jul-2024 Warner Losh <imp@FreeBSD.org>

nvme: widen nvme_qpair_manual_complete_request for better errors

Make nvme_qpair_manual_complete_request take dnr as well as a
print_on_error action. Make the status word computation common between
it and nvme_qpair_manual_complete_tracker. And print the error when
we are cancelling the I/O on failure, but not when we're filtering
the I/O after we've failed. Make it private again to nvme_qpair.c.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D46049


# 9229b310 20-Jul-2024 Warner Losh <imp@FreeBSD.org>

nvme: Fail passthrough commands right away in failed state.

When the drive is failed, we can't send passthrough commands to the
card, so fail them right away. Rearrange the comments to reflect the
current failure paths in the driver.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D46048
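
A sketch of the early-failure check; the function name and the error value
are invented, and ctrlr->is_failed is the flag other entries in this log
refer to:

#include <errno.h>
#include <stdbool.h>

struct ctrlr_model {
    bool is_failed;
};

static int submit_passthrough(struct ctrlr_model *c)
{
    if (c->is_failed)
        return ENXIO;       /* fail right away: the card won't answer */
    /* ... otherwise build and submit the command as usual ... */
    return 0;
}

int main(void)
{
    struct ctrlr_model c = { .is_failed = true };

    return submit_passthrough(&c) == ENXIO ? 0 : 1;
}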


Revision tags: release/14.1.0
# 0dd84c3b 13-May-2024 Warner Losh <imp@FreeBSD.org>

nvme: Add comment about where tr->deadline is set

It's easy to overlook the chain of events that lead to tr->deadline
being updated. Add a comment here to explain what otherwise looks like
an oversight w/o careful study.

Sponsored by: Netflix


# c931cf6a 13-May-2024 Warner Losh <imp@FreeBSD.org>

nvme: Slight simplification

We don't need to dereference qpair to get the ctrlr pointer each time,
so use the cached value. It's not going to change. No change intended.

Sponsored by: Netflix


# 9db8ca92 13-May-2024 Warner Losh <imp@FreeBSD.org>

nvme: Slightly rework this loop to match FreeBSD style

Update the comment for the code, and slightly rework the code in the
'fast exit' paradigm that FreeBSD generally tries to do.

Sponsored by: Netflix
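
For readers unfamiliar with the idiom, a generic example of the "fast exit"
shape (unrelated to the driver's actual loop): disqualifying cases leave
early, so the main work stays unindented.

#include <stddef.h>

struct item {
    int ready;
    int value;
    struct item *next;
};

static int sum_ready(const struct item *head)
{
    int total = 0;

    for (const struct item *it = head; it != NULL; it = it->next) {
        if (!it->ready)
            continue;       /* bail out of this iteration early */
        total += it->value;
    }
    return total;
}

int main(void)
{
    struct item b = { .ready = 1, .value = 3, .next = NULL };
    struct item a = { .ready = 0, .value = 7, .next = &b };

    return sum_ready(&a) == 3 ? 0 : 1;
}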


# 5a178b83 13-May-2024 Warner Losh <imp@FreeBSD.org>

nvme: Add locking asserts

nvme_qpair_complete_tracker and nvme_qpair_manual_complete_tracker have
to be called without the qpair lock, so assert it's unowned.

Sponsored by: Netflix


Revision tags: release/13.3.0
# 5650bd3f 29-Jan-2024 John Baldwin <jhb@FreeBSD.org>

nvme: Use the NVMEF macro to construct fields

Reviewed by: chuck, imp
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43605


# 479680f2 29-Jan-2024 John Baldwin <jhb@FreeBSD.org>

nvme: Use the NVMEV macro instead of expanded versions

Reviewed by: chuck
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D43595
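
The two entries above cover the field-construct (NVMEF) and field-extract
(NVMEV) helpers, which I believe pair a field's _SHIFT/_MASK constants. Below
is a self-contained stand-in with invented MY_ names and an example field;
the authoritative definitions live in sys/dev/nvme/nvme.h and may differ in
detail.

#include <assert.h>
#include <stdint.h>

#define EX_FIELD_SHIFT 4
#define EX_FIELD_MASK  0xf

#define MY_NVMEF(name, val) (((uint32_t)(val) & name##_MASK) << name##_SHIFT)
#define MY_NVMEV(name, val) (((uint32_t)(val) >> name##_SHIFT) & name##_MASK)

int main(void)
{
    uint32_t word = MY_NVMEF(EX_FIELD, 0x9);    /* pack 0x9 into bits 7:4 */

    assert(MY_NVMEV(EX_FIELD, word) == 0x9);    /* and get it back out */
    return 0;
}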


# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


Revision tags: release/14.0.0
# 8d6c0743 06-Nov-2023 Alexander Motin <mav@FreeBSD.org>

nvme: Introduce longer timeouts for admin queue

KIOXIA CD8 SSDs routinely take ~25 seconds to delete a non-empty
namespace. In some cases, like hot-plug, it takes longer, triggering a
timeout and controller reset after just 30 seconds. Linux has had a
separate 60-second timeout for the admin queue for many years. This patch
does the same, and it is good to be consistent.

Sponsored by: iXsystems, Inc.
Reviewed by: imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D42454


# afc3d49b 10-Oct-2023 Warner Losh <imp@FreeBSD.org>

nvme: Close a race in destroying qpair and timeouts

We should have cleared all the pending I/O prior to calling
nvme_qpair_destroy, which should ensure that if the callout_drain causes
a call to nvme_qpair_timeout(), it won't schedule any new
timeout. Even so, it doesn't hurt to set timeout_pending to false in
nvme_qpair_destroy() and have nvme_qpair_timeout() exit early if it sees
it w/o scheduling a timeout. Since we don't otherwise stop the timeout
until we're about to destroy the qpair, this ensures we fail safe. The
lock/unlock also ensures the callout_drain will either remove the callout,
or wait for it to run with the early bailout.

We can likely further improve this by using callout_stop() inside the
pending lock. I'll investigate that for future refinement.

Sponsored by: Netflix
Suggestions by: jhb
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D42065
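
A small model of that fail-safe, using a pthread mutex and invented names in
place of the driver's lock and callout machinery:

#include <pthread.h>
#include <stdbool.h>

struct qpair_model {
    pthread_mutex_t lock;
    bool timeout_pending;
};

static void schedule_timeout(struct qpair_model *q)
{
    q->timeout_pending = true;          /* models re-arming the callout */
}

static void qpair_timeout(struct qpair_model *q)
{
    pthread_mutex_lock(&q->lock);
    if (!q->timeout_pending) {          /* being destroyed: bail out early */
        pthread_mutex_unlock(&q->lock);
        return;
    }
    /* ... normal timeout scan would run here ... */
    schedule_timeout(q);                /* re-arm only while still live */
    pthread_mutex_unlock(&q->lock);
}

static void qpair_destroy(struct qpair_model *q)
{
    pthread_mutex_lock(&q->lock);
    q->timeout_pending = false;         /* any late callback now bails */
    pthread_mutex_unlock(&q->lock);
    /* callout_drain() equivalent: wait for a running callback to finish */
}

int main(void)
{
    struct qpair_model q = { .timeout_pending = true };

    pthread_mutex_init(&q.lock, NULL);
    qpair_destroy(&q);      /* clears the flag ... */
    qpair_timeout(&q);      /* ... so a late callback bails without re-arming */
    pthread_mutex_destroy(&q.lock);
    return 0;
}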


# 9cd7b624 10-Oct-2023 Warner Losh <imp@FreeBSD.org>

nvme: Eliminate RECOVERY_FAILED state

While it seemed like a good idea to have this state, we can do
everything we wanted with the state by checking ctrlr->is_failed since
that's set before we start failing the qpairs. Add some comments about
racing when we're failing the controller, though in practice I'm not
sure that kind of race could even be lost.

Sponsored by: Netflix
Reviewed by: chuck, gallatin, jhb
Differential Revision: https://reviews.freebsd.org/D42051


# 1d6021cd 26-Sep-2023 Warner Losh <imp@FreeBSD.org>

nvme: Suppress noise messages

When we're suspending, we get messages about waiting for the controller
to reset. These are in error: we're not waiting for it to reset. We enter
the recovery state as part of suspending, so we should suppress these as
false positives.

Also remove a stray debug message that's left over from earlier versions
of the recovery code and no longer makes sense.

Sponsored by: Netflix


# da8324a9 24-Sep-2023 Warner Losh <imp@FreeBSD.org>

nvme: Fix locking protocol violation to fix suspend / resume

Currently, when we suspend, we need to tear down all the qpairs. We call
nvme_admin_qpair_abort_aers with the admin qpair lock held, but the
tracker it will call for the pending AER also locks it (recursively)
hitting an assert. This routine is called without the qpair lock held
when we destroy the device entirely in a number of places. Add an assert
to this effect and drop the qpair lock before calling it.
nvme_admin_qpair_abort_aers then locks the qpair lock to traverse the
list, dropping it around calls to nvme_qpair_complete_tracker, and
restarting the list scan after picking it back up.

Note: If interrupts are still running, there's a tiny window for these
AERs: If one fires just an instant after we manually complete it, then
we'll be fine: we set the state of the queue to 'waiting' and we ignore
interrupts while 'waiting'. We know we'll destroy all the queue state
with these pending interrupts before looking at them again and we know
all the TRs will have been completed or rescheduled. So either way we're
covered.

Also, tidy up the failure case as well: failing a queue is a superset of
disabling it, so no need to call disable first. This solves some
locking issues with recursion since we don't need to recurse. Set the
qpair state of failed queues to RECOVERY_FAILED and stop scheduling the
watchdog. Assert we're not failed when we're enabling a qpair, since
failure currently is one-way. Make failure a little less verbose.

Next, kill the pre/post reset stuff. It's completely bogus since we
disable the qpairs, we don't need to also hold the lock through the
reset: disabling will cause the ISR to return early. This keeps us from
recursing on the recovery lock when resuming. We only need the recovery
lock to avoid a specific race between the timer and the ISR.

Finally, kill NVME_RESET_2X. It's been a major release since we put it
in and nobody has used it as far as I can tell. And it was a motivator
for the pre/post uglification.

These are all interrelated, so need to be done at the same time.

Sponsored by: Netflix
Reviewed by: jhb
Tested by: jhb (made sure suspend / resume worked)
MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D41866
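
A userspace model of the locking pattern just described for aborting the
pending AERs: walk the list under the lock, drop the lock around each manual
completion (which must run unlocked), and restart the scan from the head once
the lock is re-acquired. All types and names are simplified stand-ins; the
real nvme_admin_qpair_abort_aers only targets AER trackers and does more
bookkeeping.

#include <pthread.h>
#include <stddef.h>

struct tr_model {
    struct tr_model *next;
};

struct qpair_model {
    pthread_mutex_t lock;
    struct tr_model *outstanding;           /* trackers awaiting completion */
};

/* Completion runs without the caller holding the lock; here it just unlinks. */
static void manual_complete_tracker(struct qpair_model *q, struct tr_model *t)
{
    struct tr_model **pp;

    pthread_mutex_lock(&q->lock);
    for (pp = &q->outstanding; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            break;
        }
    }
    pthread_mutex_unlock(&q->lock);
}

static void abort_aers(struct qpair_model *q)
{
    struct tr_model *t;

    pthread_mutex_lock(&q->lock);
    while ((t = q->outstanding) != NULL) {
        pthread_mutex_unlock(&q->lock);     /* drop around the completion */
        manual_complete_tracker(q, t);
        pthread_mutex_lock(&q->lock);       /* restart the scan from the head */
    }
    pthread_mutex_unlock(&q->lock);
}

int main(void)
{
    struct tr_model t2 = { .next = NULL };
    struct tr_model t1 = { .next = &t2 };
    struct qpair_model q = { .outstanding = &t1 };

    pthread_mutex_init(&q.lock, NULL);
    abort_aers(&q);         /* drains the list, completing each tracker */
    pthread_mutex_destroy(&q.lock);
    return q.outstanding == NULL ? 0 : 1;
}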


# d9543162 15-Sep-2023 Warner Losh <imp@FreeBSD.org>

nvme: Give up when we've failed

Normally, we poll the device every so often to see if commands have
timed out. However, we'll go into the recovery state as part of failing
the drive. To account for all possibilities, if we're failed when we get
into the polling function, just stop polling: Party is over.

Sponsored by: Netflix


# 8052b01e 25-Aug-2023 Warner Losh <imp@FreeBSD.org>

nvme: Add exclusion for ISR

Add a basically uncontended spinlock that we take out while the ISR is
running. This has two effects: First, when we get a timeout, we can
safely call the nvme_qpair_process_completions w/o racing any ISRs.
Second, we can use it to ensure that we don't reset the card while
the ISRs are active (right now we just sleep and hope for the best,
which usually is fine, but not always).

Sponsored by: Netflix
MFC After: 2 weeks
Reviewed by: chuck, gallatin
Differential Revision: https://reviews.freebsd.org/D41452
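
The exclusion can be modeled with an ordinary POSIX spin lock; the names here
are invented and the driver's own lock differs in detail:

#include <pthread.h>

static pthread_spinlock_t isr_lock;

static void process_completions(void) { /* reap the completion queue */ }

static void isr(void)
{
    pthread_spin_lock(&isr_lock);
    process_completions();
    pthread_spin_unlock(&isr_lock);
}

static void timeout_poll(void)
{
    /* Safe even if an interrupt fires at the same moment. */
    pthread_spin_lock(&isr_lock);
    process_completions();
    pthread_spin_unlock(&isr_lock);
}

int main(void)
{
    pthread_spin_init(&isr_lock, PTHREAD_PROCESS_PRIVATE);
    isr();
    timeout_poll();
    pthread_spin_destroy(&isr_lock);
    return 0;
}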


# d4959bfc 25-Aug-2023 Warner Losh <imp@FreeBSD.org>

nvme: Greatly improve error recovery

Next phase of error recovery: Eliminate the RECOVERY_START phase, since
we don't need to wait to start recovery. Eliminate the RECOVERY_RESET
phase since it is transient; we now transition from RECOVERY_NORMAL into
RECOVERY_WAITING.

In normal mode, read the status of the controller. If it is in failed
state, or appears to be hot-plugged, jump directly to reset which will
sort out the proper things to do. This will cause all pending I/O to
complete with an abort status before the reset.

When in the NORMAL state, call the interrupt handler. This will complete
all pending transactions when interrupts are broken or temporarily
misbehaving. We then check all the pending completions for timeouts. If
we have abort enabled, then we'll send an abort. Otherwise we'll assume
the controller is wedged and needs a reset. By calling the interrupt
handler here, we'll avoid an issue with the current code where we
transitioned to RECOVERY_START which prevented any completions from
happening. Now completions happen. In addition, follow-on I/O that is
scheduled in the completion routines will be submitted, rather than
queued, because the recovery state is correct. This also fixes a problem
where I/O would timeout, but never complete, leading to hung I/O.

Resetting remains the same as before, just when we chose to reset has
changed.

A nice side effect of these changes is that we now do I/O when
interrupts to the card are totally broken. Follow-on commits will improve
the error reporting and logging when this happens. Performance will be
awful, but will at least be minimally functional.

There is a small race when we're checking the completions if interrupts
are working, but this is handled in a future commit.

Sponsored by: Netflix
MFC After: 2 weeks
Differential Revision: https://reviews.freebsd.org/D36922
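
A decision-flow sketch of the NORMAL-state handling described above; the
struct and helper names are invented for illustration, and the real driver
reads controller registers and walks its tracker lists instead.

#include <stdbool.h>
#include <stdio.h>

struct ctrlr_model {
    bool failed_or_gone;    /* status says failed, or the card reads as gone */
    bool abort_enabled;     /* allowed to send an Abort command */
    bool have_timed_out_io; /* at least one tracker is past its deadline */
};

static void reset_controller(struct ctrlr_model *c)    { (void)c; printf("resetting controller\n"); }
static void run_interrupt_handler(struct ctrlr_model *c) { (void)c; printf("reaping completions\n"); }
static void send_abort(struct ctrlr_model *c)          { (void)c; printf("sending Abort\n"); }

static void recovery_normal_tick(struct ctrlr_model *c)
{
    if (c->failed_or_gone) {    /* failed or hot-plugged: go straight to reset */
        reset_controller(c);
        return;
    }
    run_interrupt_handler(c);   /* completes pending I/O even if IRQs are broken */
    if (!c->have_timed_out_io)
        return;
    if (c->abort_enabled)
        send_abort(c);          /* try to nudge the stuck command first */
    else
        reset_controller(c);    /* otherwise assume the controller is wedged */
}

int main(void)
{
    struct ctrlr_model c = { .failed_or_gone = false, .abort_enabled = true,
        .have_timed_out_io = true };

    recovery_normal_tick(&c);   /* healthy but stuck I/O: sends an abort */
    return 0;
}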


# 2a6b7055 25-Aug-2023 Warner Losh <imp@FreeBSD.org>

nvme: Timeout expired transactions

When we went to having a shared timeout routine, the code that fails
timed-out transactions was inadvertently dropped. Reinstate it.

Fixes: 502dc84a8b670
Sponsored by: Netflix
MFC After: 2 weeks
Reviewed by: chuck, jhb
Differential Revision: https://reviews.freebsd.org/D36921

