Here's the source code for GNU wc -
https://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;... .
It's easy to see that it's not the result of 10+ years of low-level optimizations to eke out the most performance.
Your test code probably hits the MB_CUR_MAX>1 path at line 361. (Check your locale setting!)
The main loop is:
402 if (!in_shift && is_basic (*p))
403 {
404 /* Handle most ASCII characters quickly, without calling
405 mbrtowc(). */
406 n = 1;
407 wide_char = *p;
408 wide = false;
409 }
...
443 switch (wide_char)
444 {
445 case '\n':
446 lines++;
447 FALLTHROUGH;
448 case '\r':
449 case '\f':
450 if (linepos > linelength)
451 linelength = linepos;
452 linepos = 0;
453 goto mb_word_separator;
454 case '\t':
455 linepos += 8 - (linepos % 8);
456 goto mb_word_separator;
457 case ' ':
458 linepos++;
459 FALLTHROUGH;
460 case '\v':
461 mb_word_separator:
462 words += in_word;
463 in_word = false;
464 break;
465 default:
466 if (wide && iswprint (wide_char))
....
480 else if (!wide && isprint (to_uchar (*p)))
481 {
482 linepos++;
483 if (isspace (to_uchar (*p)))
484 goto mb_word_separator;
485 in_word = true;
486 }
487 break;
488 }
This is a much more complicated implementation than your code. Among other things, note how it uses isprint/iswprint on each character, and how these are locale dependent.
Even when character = byte, the main loop uses the same logic:
555 default:
556 if (isprint (to_uchar (p[-1])))
557 {
558 linepos++;
559 if (isspace (to_uchar (p[-1]))
560 || isnbspace (to_uchar (p[-1])))
561 goto word_separator;
562 in_word = true;
563 }
564 break;
565 }
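To make the state machine easier to follow, here is a rough Python model of the byte-oriented loop quoted above. It is only a sketch under assumptions of mine: the C locale, where isprint() is true only for 0x20..0x7E and isnbspace() never matches; no linepos/linelength tracking; and a final "words += in_word" at end of input. The name wc_c_locale is hypothetical, not something from coreutils.

```python
def wc_c_locale(data):
    """Rough model of the quoted single-byte loop; returns (lines, words, bytes).

    Assumes the C locale: only 0x20..0x7E are printable, and the only
    printable whitespace byte is the space itself.
    """
    lines = 0
    words = 0
    in_word = False
    for b in data:
        if b == 0x0A:                 # '\n' counts a line, then ends a word
            lines += 1
            words += in_word
            in_word = False
        elif b in b"\r\f\t \v":       # the other explicit separator cases
            words += in_word
            in_word = False
        elif 0x21 <= b <= 0x7E:       # printable and not whitespace
            in_word = True
        # Any other byte (control or high-bit) is not printable here,
        # so it neither starts nor ends a word.
    words += in_word                  # assumption: close a word still open at EOF
    return lines, words, len(data)


print(wc_c_locale(b"hello world\n"))  # (1, 2, 12)
print(wc_c_locale(b"a\x01b"))         # (0, 1, 3)
```

Note the second call: in this model a control byte inside a word does not split it, which is the kind of detail a simple "split on whitespace" counter gets wrong.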
Your benchmark only uses the characters:
"\n\r ',-./0123456789ABCDEFGHIJKLMNOPQRSTUVYZabcdefghijklmnopqrstuvwxyz"
so it comes nowhere near verifying that the two programs do the same thing.
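If you want to check the byte coverage of a benchmark file yourself, something like the following sketch works. The helper name distinct_bytes is mine, and it takes the data as a bytes object (for a real file, read it with open(path, "rb").read()).

```python
def distinct_bytes(data):
    """Return the distinct byte values in data, sorted, as a bytes object."""
    return bytes(sorted(set(data)))


sample = b"banana\n"
used = distinct_bytes(sample)
print(used)                            # b'\nabn'
print(256 - len(used), "byte values never appear")
```

Any byte value that never appears is a code path in wc that the comparison never exercised.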
The following test set should make wc's output harder to reproduce. I create it with Python:
with open("testset.dat", "wb") as f:
    for b1 in range(256):
        for b2 in range(256):
            for b3 in range(256):
                _ = f.write(bytes((b1, b2, b3)))
then print the output using two different locales:
% env LANG=en_US.UTF-8 wc testset.dat
196608 1523713 50331648 testset.dat
% env LANG=C wc testset.dat
196608 1152001 50331648 testset.dat
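The line and byte counts are the same in both locales and can be checked by hand; only the word count depends on the locale. A quick sanity check of the arithmetic (the variable names are mine):

```python
# Every 3-byte combination is written once, so the file holds 256**3 triples.
total_bytes = 3 * 256**3

# wc counts '\n' bytes. For each of the 3 positions, the byte is 0x0A in
# exactly 256**2 triples (the other two positions are free).
newline_count = 3 * 256**2

print(newline_count, total_bytes)  # 196608 50331648
```

Both match the first and third columns of the wc output above.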