Depending on the hardware, that byte can be 8, 12, 24, 32, or 64 bits (in my own experience over the years).
It's sad so many here are claiming that there are always and only 8 bits to a byte.
I've already replied to one person who, in support of this misconception, quoted an ISO standard that they clearly had not read.
That’s normally called a word. I’ve seen casual usage of “byte” in that context but formal documentation has always used word.
This stems from architectures that had both a minimal and a maximal addressable "atomic chunk" via segmented architecture.
Conversations about this varied between Intel, Motorola, and TI chipset users, PDP vs. Cyber programmers, etc.
From a Standard C programming (as portable assembly) viewpoint, char and byte are interchangeable and are the smallest atomic type, with the number of bits defined by a header constant (CHAR_BIT in <limits.h>).
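To make the Standard C point concrete, here is a minimal sketch. The header constant is CHAR_BIT from <limits.h>, and sizeof(char) is 1 by definition, so a "byte" in C is however many bits a char has on that implementation. The helper names bits_per_byte and bits_in are my own, purely for illustration.

```c
#include <limits.h>  /* CHAR_BIT: number of bits in a char, i.e. in one C byte */
#include <stddef.h>  /* size_t */

/* Number of bits in one C byte on this implementation.
 * The C standard requires at least 8; it does not require exactly 8. */
static int bits_per_byte(void) {
    return CHAR_BIT;
}

/* Width in bits of an object that occupies size_in_bytes C bytes,
 * for example bits_in(sizeof(int)). Hypothetical helper for illustration. */
static size_t bits_in(size_t size_in_bytes) {
    return size_in_bytes * (size_t)CHAR_BIT;
}
```

Calling bits_per_byte() from main and printing it gives 8 on mainstream desktop and server compilers, but portable code should not assume that. sizeof always counts in these C bytes, which is why sizeof(char) is 1 no matter how wide a char is.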
[1] https://www.iso.org/obp/ui/#iso:std:iso-iec:2382:ed-1:v1:en
So no, it's not a contraction of "by eight". It's just a different spelling for "bite", as a pun on "a bite of bits".
But I've seen "byte" used to refer to other numbers of bits lots of times, although primarily in the old days. Not so much anymore. I think the meaning has changed with time and has become fixed on "8 bits", but it was not always so.
What systems are there with such large bytes?
They are fiends for pipelined throughput and have low-level ASM instructions that step through vector calculations with modular indexing per clock cycle (increment pointers, multiply and add arguments, store results, etc.), but they are built for float/double numerical grunt work, with no ground given for bit-twiddling hacks.
There are other examples but that should suffice.
In the days when I learned what it was, it was fairly common for machines to have addressable units larger than a byte. We said that these machines "did not address to the byte". That is literally something you would say. Because a BYTE WAS NOT THE SAME THING AS AN ADDRESSABLE UNIT.
A word isn't the same thing as an addressable unit, either; your machine's word size is the number of bits it transfers over its main bus at one time and/or the number of bits it can work on in one operation. Which is maybe a bit fuzzy nowadays.
If I recall correctly, I read somewhere, back in the 1970s, that the word "byte" was originally used to mean the size of a text character. But in any common usage, even by the time I learned about it, it had settled down to just meaning 8 bits, period.