fuzz clang-format #23426

Open
kcc opened this issue Mar 28, 2015 · 10 comments
Labels
bugzilla Issues migrated from bugzilla clang-format

Comments

@kcc
Contributor

kcc commented Mar 28, 2015

Bugzilla Link 23052
Version unspecified
OS Linux
CC @d0k,@KernelAddress

Extended Description

We have a fuzzer of clang-format in the source tree.
Details: llvm/lib/Fuzzer/README.txt

It has found a few bugs so far:
r226685, r226678, r226451, r226446, r226448, r227427, r226447,
r226685, r226680, r226698, r229485, r227677, r227433, r227427,
r230395, r231066, (probably missed a couple more)

A few remain; we will post them here, one per comment.

There is also a build bot which runs the fuzzer 24/7 and will report new
bugs (regressions) if they appear or old bugs if the fuzzer discovers them.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer

@kcc
Contributor Author

kcc commented Mar 28, 2015

clang-format (and clang-format-fuzzer) is very slow on a tiny input.
This may not be a big problem by itself (or maybe it is),
but it hurts fuzzing badly. With all the fuzzer instrumentation,
it takes ~1.5 seconds to format 60 bytes.
Without instrumentation it takes ~0.5 seconds.

cat << EOF | base64 --decode | clang-format
PDw8SAQEMigqLCioKjFoLGgKPDw8PDw8Cjw8PCxkKiQcPDw8KCosKKgiaCxoCigKCjw8PAo8PGQq
KKA6
EOF

Perf:
51.83% clang::format::(anonymous namespace)::AnnotatingParser::next()
13.12% clang::format::(anonymous namespace)::AnnotatingParser::parseParens(bool)
11.87% clang::format::(anonymous namespace)::AnnotatingParser::consumeToken()
8.32% clang::format::(anonymous namespace)::AnnotatingParser::parseAngle()
5.01% clang::getBinOpPrecedence(clang::tok::TokenKind, bool, bool)
4.90% clang::format::(anonymous namespace)::AnnotatingParser::updateParameterCount(clang::format::FormatToken*, clang::format::Format
2.27% clang::format::FormatToken::isSimpleTypeSpecifier() const
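
For anyone who wants to repeat the measurement from a script, here is a minimal Python sketch. It decodes the reproducer above and times a command that reads it on stdin. The helper names (`decode_reproducer`, `time_command`) are made up for illustration, and the sketch assumes a `clang-format` binary on PATH; timings will depend on the build and machine.

```python
import base64
import subprocess
import sys
import time

# Reproducer copied verbatim from the comment above.
REPRO_B64 = """
PDw8SAQEMigqLCioKjFoLGgKPDw8PDw8Cjw8PCxkKiQcPDw8KCosKKgiaCxoCigKCjw8PAo8PGQq
KKA6
"""


def decode_reproducer(b64_text):
    # b64decode discards characters outside the base64 alphabet (such as
    # newlines) by default, so the wrapped text can be passed in as-is.
    return base64.b64decode(b64_text)


def time_command(argv, stdin_bytes):
    """Run argv with stdin_bytes on stdin; return (seconds, return code)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, input=stdin_bytes,
                          stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, proc.returncode


if __name__ == "__main__":
    data = decode_reproducer(REPRO_B64)
    print(f"input size: {len(data)} bytes")
    try:
        # Assumption: clang-format is installed and on PATH.
        seconds, rc = time_command(["clang-format"], data)
        print(f"clang-format: {seconds:.3f} s (exit code {rc})")
    except FileNotFoundError:
        print("clang-format not found on PATH", file=sys.stderr)
```

The decoded input is 60 bytes, matching the size quoted above. Passing a different `argv` (for example the path to an instrumented build) compares the two configurations.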

@kcc
Contributor Author

kcc commented Mar 28, 2015

This one is worse: 31 seconds without instrumentation for 64 bytes, same profile.

cat << EOF | base64 --decode | clang-format
PDw8SAQEMigqLCioKDFoLGgKPDw8PDw8CjwKPDw8PEhoCjw8PBw8PDwoKiwoqCJoLGgKKAoKPDw8
Cjw8PDw8PA==
EOF

@d0k
Member

d0k commented Mar 28, 2015

A chain of < seems to trigger superlinear runtime in the parser.

perl -e 'print "<" x 20'|clang-format

n | seconds
20 | 0.101
21 | 0.191
22 | 0.367
23 | 0.722
24 | 1.431
25 | 2.730
26 | 5.173
27 | 10.026
28 | 19.779
29 | 39.350
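
To see how fast the runtime grows, one can compute the ratio between consecutive rows of the table. The short Python sketch below does that using only the numbers listed above; the function name `growth_ratios` is just for illustration.

```python
# (n, seconds) pairs copied from the table above.
TIMINGS = [
    (20, 0.101), (21, 0.191), (22, 0.367), (23, 0.722), (24, 1.431),
    (25, 2.730), (26, 5.173), (27, 10.026), (28, 19.779), (29, 39.350),
]


def growth_ratios(timings):
    """Return t(n+1) / t(n) for each consecutive pair of measurements."""
    return [later / earlier
            for (_, earlier), (_, later) in zip(timings, timings[1:])]


ratios = growth_ratios(TIMINGS)
for (n, _), ratio in zip(TIMINGS[1:], ratios):
    print(f"n={n}: {ratio:.2f}x the time for n={n - 1}")
print(f"mean ratio: {sum(ratios) / len(ratios):.2f}")
```

Every ratio falls between roughly 1.9 and 2.0, so each additional `<` about doubles the runtime in this range. That would be consistent with growth close to 2^n, although ten data points do not prove the asymptotic behavior.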

@kcc
Contributor Author

kcc commented Apr 2, 2015

echo LypcAAov | base64 --decode | clang-format -

Assertion `TokenText.startswith("/") && TokenText.endswith("/")' failed.
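
The reproducer is short enough to inspect by hand. A small Python sketch that decodes it and prints the bytes in escaped form:

```python
import base64

# Reproducer copied from the comment above.
data = base64.b64decode("LypcAAov")

# repr() escapes the non-printable bytes so they are readable in a terminal.
print(len(data), repr(data))
```

The input is six bytes: `/*\`, a NUL byte, a newline, and `/`.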

@kcc
Contributor Author

kcc commented Apr 2, 2015

echo PCo+Iis/J2FjIDpTDT46zvxcXAp1NzI49zxGPg== | base64 --decode | clang-format -

Assertion `EndColumn >= StartColumn' failed.

@kcc
Contributor Author

kcc commented Apr 2, 2015

*** Bug #23294 has been marked as a duplicate of this bug. ***

@kcc
Contributor Author

kcc commented May 6, 2015

The clang/clang-format fuzzer bot
lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer
has been extended to run both with and without assertions.
Whenever a bug is found, the fuzzer prints the base64-encoded reproducer
so that one can copy-paste it from the buildbot logs.
E.g., from the bot logs:

SUMMARY: AddressSanitizer: ...
CRASHED; file written to crash-80193815206841682354717562770799349303
Base64: OiDgO3gKUyYhU0Z4KhFoEztFKGV1bZNTe5Hsk1MmKUMheCoTIWgTO0VTKMFldW2TUzs=

Just do this:
echo OiDgO3gKUyYhU0Z4KhFoEztFKGV1bZNTe5Hsk1MmKUMheCoTIWgTO0VTKMFldW2TUzs= | base64 -d | clang -x c++ -
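
If there are many such lines in a log, the copy-paste step can be scripted. The Python sketch below pulls every `Base64:` payload out of a log and decodes it. It assumes the log line format is exactly as in the excerpt above; the regex and the name `extract_reproducers` are illustrative, not part of the fuzzer or the bot.

```python
import base64
import re

# Matches the "Base64: <payload>" line format shown in the bot log excerpt.
BASE64_LINE = re.compile(r"^Base64:\s*([A-Za-z0-9+/]+={0,2})\s*$", re.MULTILINE)


def extract_reproducers(log_text):
    """Decode every 'Base64:' payload found in log_text into bytes."""
    return [base64.b64decode(m.group(1)) for m in BASE64_LINE.finditer(log_text)]


# Log excerpt copied from the comment above.
LOG_EXCERPT = """\
SUMMARY: AddressSanitizer: ...
CRASHED; file written to crash-80193815206841682354717562770799349303
Base64: OiDgO3gKUyYhU0Z4KhFoEztFKGV1bZNTe5Hsk1MmKUMheCoTIWgTO0VTKMFldW2TUzs=
"""

reproducers = extract_reproducers(LOG_EXCERPT)
for index, blob in enumerate(reproducers):
    print(f"reproducer {index}: {len(blob)} bytes")
```

Each decoded blob can then be written to a file or piped to the tool under test, the same way as the `echo ... | base64 -d` command above.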

@kcc
Contributor Author

kcc commented May 6, 2015

Daniel, many thanks for the fixes.
The next biggest offender is

clang-format-fuzzer: /mnt/b/sanitizer-buildbot5/sanitizer-x86_64-linux-fuzzer/build/llvm/tools/clang/lib/Format/ContinuationIndenter.cpp:1066: unsigned int clang::format::ContinuationIndenter::breakProtrudingToken(const clang::format::FormatToken &, clang::format::LineState &, bool): Assertion `NewRemainingTokenColumns < RemainingTokenColumns' failed.

reproducer (base64-encoded):
SCQhJCwxLGNvbnN0ZQx4ciBjaHIzaDJ0IDMqMiAjJCgpABkMLTo9IGdldCxRKiJzdFwwXPSKpKQ6JFxcIg==

You may get more reproducers from the bot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer

@llvmbot
Collaborator

llvmbot commented Jul 21, 2015

Fixed crasher in r242738.

@kcc
Contributor Author

kcc commented Jan 5, 2016

The clang-format-fuzzer bot has been mostly green lately,
with only one periodic assertion failure, bug 26032.
I've changed the bot to treat clang-format-fuzzer failures as real ones,
not just warnings.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021