ColumnLimit check for trailing comments alignment acts wrong for multi-byte UTF-8 #47624
Comments
Fixes llvm#47333. Fixes llvm#47624. Fixes llvm#75929. Fixes llvm#87885. Fixes llvm#89916.
I wonder how this happens. If clang-format counted 2 bytes per character instead of 1, shouldn't it break at around the 40th column, since it would be measuring the raw byte count? More specifically, I would expect the comment to come out something like the following:
I did some character math and worked out the line breaking that I think clang-format should have produced had it actually formatted on a per-byte basis:
do note however, that as per
the following letters consume part of the boundary
thus following
Also note that since clang-format did NOT produce this output, either it does not truly work at 1-byte granularity as you suggested, or it does something else.
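The "around the 40th column" estimate above can be checked with a small sketch. It assumes a purely byte-based limit of 80 and counts how many characters of a line fit inside that budget. The function name `chars_fitting_in_byte_budget` is hypothetical and is not part of clang-format; this only illustrates the arithmetic, not clang-format's actual logic.

```python
def chars_fitting_in_byte_budget(text: str, limit: int = 80) -> int:
    """Return how many leading characters of `text` fit in `limit` UTF-8 bytes.

    Hypothetical helper: models a formatter that measures line length
    in raw bytes rather than in characters.
    """
    used = 0
    for count, ch in enumerate(text):
        used += len(ch.encode("utf-8"))  # 1 byte for ASCII, 2 for Cyrillic
        if used > limit:
            return count  # `count` characters fit before the budget ran out
    return len(text)


# A line of ASCII fills the budget after 80 characters,
# while a line of 2-byte Cyrillic letters fills it after only 40.
print(chars_fitting_in_byte_budget("a" * 100))
print(chars_fitting_in_byte_budget("я" * 100))
```

Under this assumption, a comment made up entirely of Cyrillic letters would be wrapped at roughly half the configured column limit, which is the behaviour the comment above says was not observed.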
Fixes llvm#37705. Fixes llvm#47333. Fixes llvm#47624. Fixes llvm#58850. Fixes llvm#75929. Fixes llvm#87885. Fixes llvm#89916.
Extended Description
.clang-format:
What we get after running clang-format:
What we should have:
The Russian trailing comments are encoded as 2-byte UTF-8 characters, and clang-format apparently counts their length in raw bytes when checking whether the line exceeds the column limit. As a result, the comments fall back closer to the code even though there is still enough room to align them.
This affects only the check that triggers trailing comment alignment. Line breaking at the 80-character limit works correctly for multi-byte characters.
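To illustrate the suspected discrepancy, here is a minimal sketch comparing the two ways of measuring a line that ends in a Russian trailing comment. The sample line and the helper names `byte_length` and `codepoint_length` are made up for illustration. Code-point count is used as a rough stand-in for display width, which holds for plain Cyrillic text but not for combining marks or wide characters.

```python
def byte_length(text: str) -> int:
    """Length of `text` in raw UTF-8 bytes."""
    return len(text.encode("utf-8"))


def codepoint_length(text: str) -> int:
    """Length of `text` in Unicode code points (approximate column width)."""
    return len(text)


# Hypothetical sample: 10 ASCII characters followed by 6 Cyrillic letters.
line = "int x; // привет"

print(codepoint_length(line))  # 16 columns as seen in an editor
print(byte_length(line))       # 22 bytes, since each Cyrillic letter takes 2
```

A check based on `byte_length` would treat this line as 6 columns longer than it appears, so a longer comment could be judged to exceed the column limit while still visibly fitting.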