LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 26476 - [mips] std::regex_traits considers '-' to be a member of the 'w' class
Status: RESOLVED FIXED
Alias: None
Product: libc++
Classification: Unclassified
Component: All Bugs
Version: 3.8
Hardware: PC Linux
Importance: P normal
Assignee: Daniel Sanders
URL:
Keywords:
Depends on:
Blocks: 26059
 
Reported: 2016-02-04 10:54 PST by Daniel Sanders
Modified: 2016-02-17 09:04 PST

Description Daniel Sanders 2016-02-04 10:54:51 PST
The isctype.pass.cpp test currently fails with this assertion:
   isctype.pass.cpp:30: int main(): Assertion `!t.isctype('-', t.lookup_classname(s.begin(), s.end()))' failed.
This test passed in the previous release.
Comment 1 Marshall Clow (home) 2016-02-08 10:28:31 PST
Is this the only failure in that test?
(i.e., if you comment out that assert and re-run it, does it pass?)

I suspect that ctype_base (defined in __locale) is not getting built correctly for your system; there's a set of #ifdefs for different systems there, and none for mips.
Comment 2 Daniel Sanders 2016-02-10 11:13:43 PST
The 'assert(!t.isctype('@', t.lookup_classname(s.begin(), s.end())));' on line 31 and then the wchar_t version of the same two on lines 159 and 160 also fail.

I've only had a quick look at ctype_base in __locale so far, but my machine should be covered by the __GLIBC__ case. I'll keep digging.
Comment 3 Daniel Sanders 2016-02-10 18:54:28 PST
It looks like the result of lookup_classname() is wrong. If I understand this correctly, it should return a bitmask to test against the values in classic_table.

I think this class should be digit|alpha|upper|lower but somehow 'graph' has crept in too. This then matches the punct|graph|print used for the '-'.
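
To make the mask test concrete, here is a minimal sketch of how a stray 'graph' bit in the class mask lets '-' match. The bit values and the names (mask, matches, and the constants) are placeholders chosen for illustration, not the real libc++ or glibc definitions; the only assumption about the real code is that a character is in a class when its table entry shares any bit with the class mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mask type and bit values, for illustration only.
using mask = std::uint16_t;
constexpr mask upper = 1u << 0;
constexpr mask lower = 1u << 1;
constexpr mask alpha = 1u << 2;
constexpr mask digit = 1u << 3;
constexpr mask print = 1u << 6;
constexpr mask graph = 1u << 7;
constexpr mask punct = 1u << 10;

// A character is in a class if its table entry shares any bit with the class mask.
constexpr bool matches(mask table_entry, mask class_mask) {
    return (table_entry & class_mask) != 0;
}

// The mask expected for the class, and the same mask with 'graph' crept in.
constexpr mask expected_w = digit | alpha | upper | lower;
constexpr mask polluted_w = expected_w | graph;

// Table entry for '-' as described above.
constexpr mask hyphen_entry = punct | graph | print;

// Small self-check: '-' is rejected by the expected mask and accepted by the polluted one.
inline void check_word_class() {
    assert(!matches(hyphen_entry, expected_w));
    assert(matches(hyphen_entry, polluted_w));
}
```

With the expected mask, '-' shares no bit with the class and is rejected; once 'graph' is part of the mask, every graphical character, including '-' and '@', is accepted.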
Comment 4 Daniel Sanders 2016-02-10 19:16:04 PST
I'm fairly certain the problem is __regex_word. It's trying to use an unoccupied bit (0x80) but this bit is only unoccupied on little-endian machines.

On big-endian the bytes are reversed in the enum containing _ISgraph/_ISprint/etc. (by _ISbit() in /usr/include/ctype.h) so __regex_word and _ISgraph happen to have the same value. It seems that __regex_word needs to be 0x8000 on a big-endian machine.
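
The collision can be checked with a small model of _ISbit(). The two functions below reproduce the macro's big-endian and little-endian forms as I recall them from glibc's ctype.h, so treat them as an assumption and compare against your own header; the names isbit_big_endian, isbit_little_endian and occupied are made up for this sketch, as is the assumption that class indices 0 to 11 are the ones in use, with graph at index 7.

```cpp
#include <cassert>
#include <cstdint>

using mask = std::uint16_t;

// Assumed big-endian form of _ISbit(): the bit index is used directly.
constexpr mask isbit_big_endian(int bit) {
    return static_cast<mask>(1u << bit);
}

// Assumed little-endian form of _ISbit(): the two bytes are swapped.
constexpr mask isbit_little_endian(int bit) {
    return static_cast<mask>(bit < 8 ? (1u << bit) << 8 : (1u << bit) >> 8);
}

// OR together the bits for class indices 0..11 (assumed to be the ones in use).
inline mask occupied(mask (*isbit)(int)) {
    mask m = 0;
    for (int bit = 0; bit < 12; ++bit)
        m = static_cast<mask>(m | isbit(bit));
    return m;
}

// Class index 7 is assumed to be 'graph'.
constexpr mask graph_big_endian = isbit_big_endian(7);       // 0x0080
constexpr mask graph_little_endian = isbit_little_endian(7); // 0x8000
```

Under these assumptions, 0x80 is free in the little-endian layout but equals the graph bit in the big-endian layout, while 0x8000 is free in the big-endian layout.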

It's late so I'm leaving a build running overnight to confirm. I should know if this is the problem in the morning.
Comment 5 Daniel Sanders 2016-02-11 04:13:52 PST
I can confirm that this was the problem. I've posted a patch at http://reviews.llvm.org/D17132
Comment 6 Marshall Clow (home) 2016-02-11 09:24:39 PST
In r260527, I added some tests to catch this if it happens again.
Can you please update to that version and run the tests, with and without your patch?
Comment 7 Daniel Sanders 2016-02-15 09:57:54 PST
The tests you added in r260527 pass even without my patch. This is because the table is shared between C and C++ and we're colliding with the _ISgraph bit used by C rather than any of the C++ related bits.
Comment 8 Daniel Sanders 2016-02-17 09:04:09 PST
Committed an '#ifdef __mips__' version in r261088 and merged it in r261097.