Make IsBlank matcher consistent with String.isBlank #326
Thorn1089 wants to merge 6 commits into hamcrest:master
…espace' by java.lang.Character
The Java regex character class \s is not consistent with Character.isWhitespace (presumably to mimic how older, non-Unicode-aware regular expressions worked?). This means that testing for blank strings with \s gives results that are inconsistent with String.isBlank, which delegates to Character.isWhitespace. This new test case demonstrates that inconsistency.
The principle of least astonishment suggests that the matcher should be consistent with the similarly named string method. The isBlank method does a simple O(N) search through the code points of the string and bails out at the first non-whitespace character it finds, so I can't think of any negative performance implications here. The implementation continues to pass all of the original test cases.
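The mismatch described above can be reproduced in a few lines. This is a minimal sketch, not the test case added in this PR: the class name `WhitespaceMismatch`, its helper method, and the sample code points are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Hamcrest or of this PR.
final class WhitespaceMismatch {

    // Without the UNICODE_CHARACTER_CLASS flag, \s only covers [ \t\n\x0B\f\r].
    private static final Pattern REGEX_WHITESPACE = Pattern.compile("\\s");

    /** True if the single code point is matched by the regex class \s. */
    static boolean regexWhitespace(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return REGEX_WHITESPACE.matcher(s).matches();
    }

    public static void main(String[] args) {
        // SPACE, EM SPACE, FILE SEPARATOR, NO-BREAK SPACE
        int[] samples = {0x20, 0x2003, 0x1C, 0xA0};
        for (int codePoint : samples) {
            System.out.printf("U+%04X regex=%b Character.isWhitespace=%b%n",
                    codePoint,
                    regexWhitespace(codePoint),
                    Character.isWhitespace(codePoint));
        }
    }
}
```

For code points such as EM SPACE the two columns should disagree, which is the gap a `\s`-based blank check inherits.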
Well... from Travis I learned that Hamcrest still supports Java 7, which is why isBlank is not being used. I need to consider a different approach.
String#isBlank is a newer convenience method and not supported on all target platforms
return item.isBlank();
final int length = item.length();
int offset = 0;
while(offset < length) {
Consider sticking to the Hamcrest code style, which puts a space between a keyword (if, while) and the opening parenthesis.
Apparently it's not enforced in the build, but still...
Can do. Are you using an autoformatter like Spotless? It has Maven/Gradle plugins. Might be worth pulling into a separate PR :)
Thanks for fixing.
I was talking about a checker rather than a formatter.
JavaHamcrest is already using Checkstyle, but apparently it doesn't check much beyond the absence of tabs.
For the record: I'm not an official contributor to JavaHamcrest. I'm just a random someone with a stake in the project, like you, and I don't have the power to merge this PR even if I wanted to.
I imagine you based your implementation on the Java 11 JDK sources?
They look similar, yet a little bit different, and I wonder why...
The JDK's implementation distinguishes between Latin-1 and UTF-16 encoded strings. From what I can tell, a String can be encoded in either of the two. Your implementation only reflects the JDK's UTF-16 implementation. Is it safe to assume UTF-16? Does that also cover Latin-1? If such an assumption were safe, why does the JDK's implementation bother to make the distinction?
In principle, I agree with this change. I'm just wondering if we should hold off on fixing this until JavaHamcrest bumps source/target compatibility to Java 11, in which case the fix will become simpler (i.e. your first commit).
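On the Latin-1 versus UTF-16 question: as far as I know, the two encodings are an internal storage detail of String in newer JDKs, while public methods such as length() and codePointAt(int) are specified in terms of UTF-16 code units whatever the storage. A loop written against that public API should therefore behave the same for both kinds of string. Below is a minimal sketch, assuming a hypothetical helper class `BlankCheck` that completes the loop from the diff excerpt. The `Character.charCount` increment and the return statements are assumptions, since the excerpt stops before them.

```java
// Hypothetical helper; the loop mirrors the diff excerpt, the rest is assumed.
final class BlankCheck {

    /** Java 7 compatible stand-in for String.isBlank(). */
    static boolean isBlank(String item) {
        final int length = item.length();
        int offset = 0;
        while (offset < length) {
            final int codePoint = item.codePointAt(offset);
            if (!Character.isWhitespace(codePoint)) {
                return false; // bail out at the first non-whitespace code point
            }
            // Supplementary code points occupy two chars, so advance accordingly.
            offset += Character.charCount(codePoint);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBlank(" \t\r\n"));       // characters in the Latin-1 range only
        System.out.println(isBlank(" \u2003\u2028")); // whitespace outside Latin-1
        System.out.println(isBlank(" \uD83D\uDE00")); // surrogate pair, not whitespace
    }
}
```

With these inputs the first two calls should report blank and the third should not, regardless of how the JDK stores each string internally.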
int offset = 0;
while(offset < length) {
final int codePoint = item.codePointAt(offset);
if(!Character.isWhitespace(codePoint)) {
Same as above: consider putting a space between if and the opening parenthesis.
@peterdemaeyer looks like the builds have passed again, and the whitespace changes have been made :)
Going to try to kick-start Hamcrest, so if you want to get it merged, please rebase from the branch
9bc653b to e9f7fc8
Fixes #325