Releases: simdutf/simdutf
Version 8.2.0
What's Changed
Full Changelog: v8.1.0...v8.2.0
Version 8.1.0
What's New
- add simdutf::binary_length_from_base64 by @anonrig in #887
- Optimized binary_length_from_base64 functions for most kernels by @lemire in #944
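For readers unfamiliar with the idea behind a "binary length from base64" helper, here is a minimal scalar sketch. It is not simdutf's implementation and does not claim to match its exact behavior: the name `base64_decoded_length_sketch` is hypothetical, and it assumes well-formed input where ASCII whitespace and `=` padding carry no payload and every other character carries 6 bits.

```cpp
#include <cstddef>

// Hypothetical helper for illustration only (not part of simdutf).
// Assumes valid base64: whitespace and '=' are skipped, and each
// remaining character contributes 6 bits to the decoded output.
constexpr std::size_t base64_decoded_length_sketch(const char *in,
                                                   std::size_t len) {
  std::size_t symbols = 0;
  for (std::size_t i = 0; i < len; ++i) {
    const char c = in[i];
    const bool skipped = c == '=' || c == ' ' || c == '\t' || c == '\n' ||
                         c == '\r' || c == '\f';
    if (!skipped) {
      ++symbols;
    }
  }
  return symbols * 6 / 8; // whole bytes represented by the 6-bit symbols
}

// "TWFu" decodes to the 3 bytes "Man"; "TQ==" decodes to the 1 byte "M".
static_assert(base64_decoded_length_sketch("TWFu", 4) == 3, "full quantum");
static_assert(base64_decoded_length_sketch("TQ==", 4) == 1, "padded quantum");
```

Knowing the exact decoded size up front lets a caller allocate the output buffer once instead of over-allocating.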
What's Changed
- allow building from a tar ball by @lemire in #922 reported by @clausecker
- Faster amalgamation by @lemire in #928
- adding short input benchmarks by @lemire in #927
- improving the short bench: avoid optimization and add steps by @lemire in #934
- adds 'override' annotations for RVV, lasx and lsx. by @lemire in #931
- inlining get_default_implementation and get_single_implementation by @lemire in #932
- benchmark simdutf::find by @lemire in #933
- Add simdutf_utf8_length_from_utf32 to C API by @triallax in #937
- add more constexpr to utf8_to_utf16.h by @spkapust in #943
In Progress
This release contains an undocumented convert_utf16_to_utf8_with_replacement function by @mertcanaltin (in #936). It is not yet part of our public API. We will optimize the implementation in future releases.
New Contributors
- @mertcanaltin made their first contribution in #936
- @triallax made their first contribution in #937
- @spkapust made their first contribution in #943
Full Changelog: v8.0.0...v8.1.0
Version 8.0.0
Major changes
The major change in this release is that most simdutf functions are now constexpr, i.e., they can be executed at compile time. Thus, for example, you can validate that a string is proper UTF-8 at compile time:
static_assert(simdutf::validate_utf8(s));
The constexpr interface requires C++23. (You can still use simdutf with C++11.)
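To illustrate the pattern without depending on the library, here is a minimal sketch of a constexpr validator used in a static_assert. The function `is_valid_utf8_sketch` is a hypothetical scalar stand-in written for this example; it is not simdutf's implementation and it only needs C++14.

```cpp
#include <cstddef>

// Hypothetical scalar validator for illustration (not simdutf code).
// Rejects bad lead bytes, truncated or malformed continuation bytes,
// overlong encodings, surrogates, and code points above U+10FFFF.
constexpr bool is_valid_utf8_sketch(const char *s, std::size_t len) {
  std::size_t i = 0;
  while (i < len) {
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    if (b0 < 0x80) { // ASCII byte
      ++i;
      continue;
    }
    std::size_t extra = 0;    // number of continuation bytes expected
    unsigned long cp = 0;     // decoded code point
    unsigned long min_cp = 0; // smallest code point allowed for this length
    if ((b0 & 0xE0) == 0xC0) {
      extra = 1; cp = b0 & 0x1F; min_cp = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
      extra = 2; cp = b0 & 0x0F; min_cp = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
      extra = 3; cp = b0 & 0x07; min_cp = 0x10000;
    } else {
      return false; // stray continuation byte or invalid lead byte
    }
    if (len - i <= extra) {
      return false; // sequence is truncated
    }
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char b = static_cast<unsigned char>(s[i + k]);
      if ((b & 0xC0) != 0x80) {
        return false; // not a continuation byte
      }
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      return false; // overlong, out of range, or surrogate
    }
    i += extra + 1;
  }
  return true;
}

// The check happens during compilation; a failure is a build error.
constexpr char greeting[] = "caf\xC3\xA9"; // "café" encoded as UTF-8
static_assert(is_valid_utf8_sketch(greeting, sizeof(greeting) - 1),
              "greeting must be valid UTF-8");
```

The appeal of this pattern is that malformed string constants are caught at build time rather than at runtime.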
Another major change is the introduction of a C API. You can now easily call simdutf from C (although you still need to link against a C++ library, either statically or at runtime). This C API should make it easier to write wrappers to simdutf from other programming languages. We now include a C header as part of our releases.
What's Changed
- make most of simdutf constexpr by @pauldreik in #868
- add C API by @lemire in #897
- fix bug in utf8_length_from_utf16() on big endian by @pauldreik in #884
- interop with std::text_encoding by @shikharish in #881
- polish the functions in encoding_types.h by @pauldreik in #894
- ensure the override keyword is used everywhere by @pauldreik in #906
- Fix heap-buffer-overflow in convert_utf16_to_utf8_safe by @OwenSanzas in #912
- fixing issue 914 (convert_latin1_to_utf8_safe is missing from the C api) by @lemire in #915
- Fix typos by @kianmeng in #888
- Update/optimize Loongson kernels by @lemire in #885
Because we fixed a couple of bugs, including a potential buffer overflow in convert_utf16_to_utf8_safe, we recommend that all users of the library update to 8.0.0. There are no breaking changes.
Infrastructure changes
- oss-fuzz: Add unit testing build in oss-fuzz build script by @arthurscchan in #880
- fixes the check_feature_macros script. by @lemire in #889
- add ci job for checking typos by @pauldreik in #900
- add fuzzer for convert_*_safe() functions by @pauldreik in #916
- strengthen the safe_conversion fuzzer by @pauldreik in #918
New Contributors
- @shikharish made their first contribution in #881
- @arthurscchan made their first contribution in #880
- @kianmeng made their first contribution in #888
- @OwenSanzas made their first contribution in #912
Full Changelog: v7.7.1...v8.0.0
Version 7.7.1
What's Changed
- Do not use include inside our namespaces by @lemire in #870
- add simdutf constexpr more thoroughly by @lemire in #864
- optimize utf16 validation on icelake by @anonrig in #873
- optimize utf32 validation on icelake by @anonrig in #872
- Fix aarch64 constexpr build error by @pauldreik in #875
- introduce cmake option SIMDUTF_FAST_TESTS by @pauldreik in #876
- better documentation for maximal_binary_length_from_base64 by @lemire in #871
- Treat C++20 char8_t as byte-like by @leezaj in #877
- Include validate_utf16le_as_ascii inside UTF16 and ASCII features by @leezaj in #878
- Improving the performance of validate_ascii by @lemire in #879 credit to @ChALkeR for raising the issue
New Contributors
Full Changelog: v7.7.0...v7.7.1
Version 7.7.0
utf8_length_from_utf16_with_replacement. cc @anonrig We allow the breaking change on the assumption that nobody has had time to use our new function; if they have, the patch is trivial. It is against the practice of simdutf to introduce such breaking changes, so this is an exception.
What's Changed
- Return more information from utf8_length_from_utf16_with_replacement by @erikcorry in #860
New Contributors
Full Changelog: v7.6.0...v7.7.0
Version 7.6.0
What's Changed
- support reproducibility for debug sources by @hongxu-jia in #848
- Add --filter option to only run matching benchmarks by @erikcorry in #858
- UTF16 to UTF8 length with replacement by @lemire and @erikcorry in #851
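As a rough illustration of what a "UTF-16 to UTF-8 length with replacement" computation involves, here is a scalar sketch. It assumes that each unpaired surrogate is replaced by U+FFFD, which takes 3 bytes in UTF-8. The function name is hypothetical; the library's actual signature and return type may differ.

```cpp
#include <cstddef>

// Hypothetical scalar sketch (not simdutf code). Counts the UTF-8 bytes
// needed for a UTF-16 input, assuming each unpaired surrogate is replaced
// by U+FFFD (3 bytes in UTF-8). `len` counts 16-bit code units.
constexpr std::size_t utf8_length_with_replacement_sketch(const char16_t *in,
                                                          std::size_t len) {
  std::size_t bytes = 0;
  for (std::size_t i = 0; i < len; ++i) {
    const unsigned u = in[i];
    if (u < 0x80) {
      bytes += 1; // ASCII
    } else if (u < 0x800) {
      bytes += 2;
    } else if (u >= 0xD800 && u <= 0xDBFF && i + 1 < len &&
               in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
      bytes += 4; // valid surrogate pair, one 4-byte sequence
      ++i;        // skip the low surrogate
    } else {
      bytes += 3; // other BMP code point, or lone surrogate -> U+FFFD
    }
  }
  return bytes;
}

// A lone high surrogate followed by 'a': 3 bytes (U+FFFD) + 1 byte.
constexpr char16_t lone_high[] = {0xD800, u'a'};
static_assert(utf8_length_with_replacement_sketch(lone_high, 2) == 4,
              "lone surrogate counts as U+FFFD");
```

The benefit of such a function is that ill-formed UTF-16 still yields a definite output size, so the caller can size the buffer before a lossy conversion.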
New Contributors
- @davidfetter made their first contribution in #846
- @hongxu-jia made their first contribution in #848
Full Changelog: v7.5.0...v7.6.0
Version 7.5.0
What's Changed
- Implement rvv validate_utf16_as_ascii function by @tantei3 in #836
- Enable SIMD generic validate_utf16_as_ascii for lsx + lasx + ppc64 by @tantei3 in #837
- Implement to_well_formed_utf16 for rvv by @tantei3 in #838
- Typo fix: parem to param by @jasseeeem in #841
- utf16fix_block_rvv: improve mask shift by @camel-cdr in #842
- converting binary data to base64 with lines by @lemire in #840
New Contributors
- @jasseeeem made their first contribution in #841
Full Changelog: v7.4.0...v7.5.0
Version 7.4.0
What's Changed
- improving support for legacy GCC and validate_utf16_as_ascii by @lemire in #833 This fixes both #832 and #831
The new feature of this minor release is that we can check whether a UTF-16 string is 'ASCII', meaning that it can be converted to ASCII without any loss. This was requested by @trflynn89 of the Ladybird project.
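The lossless-conversion condition amounts to every 16-bit code unit being below 0x80. Here is a minimal scalar sketch of that check. The name `utf16_is_ascii_sketch` is hypothetical and the code is not the library's SIMD kernel; it also assumes `len` counts 16-bit code units.

```cpp
#include <cstddef>

// Hypothetical scalar reference (not simdutf code). A UTF-16 string
// converts to ASCII without loss exactly when no code unit exceeds 0x7F.
// Assumption: `len` is the number of 16-bit code units.
constexpr bool utf16_is_ascii_sketch(const char16_t *buf, std::size_t len) {
  unsigned acc = 0;
  for (std::size_t i = 0; i < len; ++i) {
    acc |= buf[i]; // OR all units together; any high bit will survive
  }
  return acc < 0x80;
}

static_assert(utf16_is_ascii_sketch(u"hello", 5), "plain ASCII");
static_assert(!utf16_is_ascii_sketch(u"h\u00e9llo", 5), "contains U+00E9");
```

OR-ing all units and testing once at the end avoids a branch per character, which is also the shape that vectorizes well.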
/**
* Validate the ASCII string as a UTF-16 sequence.
* An UTF-16 sequence is considered an ASCII sequence
* if it could be converted to an ASCII string losslessly.
*
* Overridden by each implementation.
*
* @param buf the UTF-16 string to validate.
* @param len the length of the string in bytes.
* @return true if and only if the string is valid ASCII.
*/
simdutf_warn_unused bool validate_utf16_as_ascii(const char16_t *buf,
size_t len) noexcept;
Full Changelog: v7.3.6...v7.4.0
Version 7.3.6
What's Changed
This patch should only concern users of the trim_partial_utf16 function.
Full Changelog: v7.3.5...v7.3.6
Version 7.3.5
What's Changed
- Improving the performance of simdutf::find and adding a benchmark for simdutf::find.