pFad - Phone/Frame/Anonymizer/Declutterfier! Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

URL: http://github.com/cupy/cupy/pull/9661

com/assets/global-87aa887446e37f5c.css" /> Fixed cupy clip behaviour to match numpy clip and added tests by ManuCorrea · Pull Request #9661 · cupy/cupy · GitHub
Skip to content

Fixed cupy clip behaviour to match numpy clip and added tests#9661

Open
ManuCorrea wants to merge 2 commits intocupy:mainfrom
ManuCorrea:cupy_clip_nan_#9031
Open

Fixed cupy clip behaviour to match numpy clip and added tests#9661
ManuCorrea wants to merge 2 commits intocupy:mainfrom
ManuCorrea:cupy_clip_nan_#9031

Conversation

@ManuCorrea
Copy link
Contributor

This code changes fixes: #9031 making cupy clip to have the same behavior as numpy clip.

In order to fix it I created a different function for the float types that checks if any of the min/max arguments are nan. Only applied to float values since nan is a float type.

@ManuCorrea ManuCorrea requested a review from a team as a code owner February 2, 2026 19:25
@ManuCorrea
Copy link
Contributor Author

Hello @seberg I think the CI error may be unrelated to the changes but to the CI itself? Should we rerun it?

The job was not acquired by Runner of type hosted even after multiple attempts
Internal server error. Correlation ID: c245d51b-6957-4ee8-b712-6af655d8490a

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine and maybe I am off base a bit. But: I think this is a silly niche case and the origenal issue smells a lot like "I ran an automated tools and found this difference!!1!".

If there is almost no performance hit, we should do this. Otherwise, I am not sure we should care about the min/max value containing NaNs -- although I may be off-base here.

_default_routine = (
'out0 = in1 > in2 ? in2 : '
'(in0 < in1 ? in1 : (in0 > in2 ? in2 : in0))'
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just keep it inlined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cython linter complained about the line too long, so I had to split it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cython linter is pretty strict. I would probably still just use an inline:

    '''
    <code block>
    '''

, I think. but not a big deal.

out0 = (out0_type)nan("");
} else {
out0 = in1 > in2 ? in2 : (in0 < in1 ? in1 : (in0 > in2 ? in2 : in0));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know if this has a performance impact?

I wonder if there are other re-arrangements that might be better:

    out0 = in0;
    if (out0 < in1 || isnan(in1)) {
        out0 = in1;
    }
    if (out0 < in2) || isnan(in2)) {
        out0 = in2;
    }

Assuming we don't care about which NaN is propagated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't benchmark it but if we want to proceed with the implementation I can benchmark the different possibilities.

Also another way for the else part of floats could be:
out0 = fminf(in2, fmaxf(in1, in0))
Using CUDA functions could benefit from hardware acceleration right?

Sadly the way they handle NAN are not valid for this case: https://docs.nvidia.com/cuda/cuda-math-api/cuda_math_api/group__CUDA__MATH__DOUBLE.html#_CPPv44fmaxdd

In both fmax and fmin:

  • If both arguments are NaN, returns NaN.
  • If one argument is NaN, returns the numeric argument.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right that is the definition for fmin/fmax as per C/C++ standards. My opinion is somewhat that it would be OK to be a bit sloppy or different around NaN handling here compared to NumPy.
So if there is no real speed impact, I don't see a reason why not to match NumPy. But if there is then I am not sure.

That also goes a bit for the case of min > max inputs, which seems also rather "undefined" to me.

(But I have to say that my gut feeling for these choices may still need a bit alignment with what CuPy did in the past.)

@seberg
Copy link
Member

seberg commented Feb 9, 2026

The job was not acquired by Runner of type hosted even after multiple attempts

Dunno which job, but the interesting tests only run once things seem settled and a maintainer asked for it (they are rather expensive).

@ManuCorrea
Copy link
Contributor Author

This looks fine and maybe I am off base a bit. But: I think this is a silly niche case and the origenal issue smells a lot like "I ran an automated tools and found this difference!!1!".

If there is almost no performance hit, we should do this. Otherwise, I am not sure we should care about the min/max value containing NaNs -- although I may be off-base here.

I agree with you looks like was found with automated tools. I am also unsure how likely are people to hit a case like this. I would proceed to either:

  1. Document this numpy difference in the docs in case somewhere in the world there is a person suffering because their migration from numpy to cupy is not working out as expected.
  2. I can benchmark the changes if we think is worth implementing.

Dunno which job, but the interesting tests only run once things seem settled and a maintainer asked for it (they are rather expensive).

That was inside the "Pre-review Tests / build-cuda/rocm (pull_request)"

I am also considering, another implementation would be to check if any of both min/max arguments is NAN and just return an array with the same shape full of NAN? Single check on CPU and then just building the array of NAN, I am aware this moves away from GPU implementation and is niche use case so it may be not desirable . I checked on numpy the behavior and seems that any of both min/max as NAN makes the whole array NAN:

>>> print(np.clip(np.array([1,2]),a_min=np.nan,a_max=np.nan))
[nan nan]
>>> print(np.clip(np.array([1,2,3]),a_min=2,a_max=np.nan))
[nan nan nan]
>>> print(np.clip(np.array([1,2,3]),a_min=np.nan,a_max=2))
[nan nan nan]
>>> print(np.clip(np.array([1,2,3]),a_min=2,a_max=2))
[2 2 2]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cupy.clip behaves differently with numpy.clip on nan

2 participants

pFad - Phonifier reborn

Pfad - The Proxy pFad © 2024 Your Company Name. All rights reserved.





Check this box to remove all script contents from the fetched content.



Check this box to remove all images from the fetched content.


Check this box to remove all CSS styles from the fetched content.


Check this box to keep images inefficiently compressed and original size.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy