In a similar vein, jart's Cosmopolitan libc has a really fun collection of tables that compare various constants across platforms, e.g. syscalls, syscall flags, error numbers, etc. It includes (variants of) Linux, XNU, NT, and the BSDs.
https://github.com/jart/cosmopolitan/blob/master/libc/sysv/c...
On the off chance you haven't heard of Cosmopolitan yet, I hope you find the discovery as much fun as I have.
I would imagine the "MacOS" bit is there to emphasize that the values haven't been verified on iOS.
I wonder why it displays without javascript in chromium, but fails to do so in firefox. If the author is here, could that be fixed?
Edit:
Restarting chromium and trying again yields the same behavior as firefox. I wonder if the javascript somehow slipped past umatrix & ublock origin on my first try. Given that I launched chromium by dragging the link onto its icon the first time, perhaps the script-blocking extensions weren't fully loaded?
Testing again several more times, that does seem likely. I can reproduce it intermittently by dragging the URL onto my chromium shortcut if chromium isn't already running.
Edit 2: Sure enough:
The HTML source shows that the table's content is not served within the page; it is added on page load by JavaScript that formats a requested JSON file. I changed the browser's user agent to Chrome and the same page was served (though the link to git shows it's a static page). My guess is that maybe Chromium is not disabling JavaScript.
PS: I'm a Firefox user too. I always browse with JavaScript disabled by uMatrix (in addition to uBlock Origin); I only enable it when the site deserves it, and I almost always leave third-party-domain JS loads disabled.
Yes. Chrome will actually pause extension loading to deliver that first rendered picture a fraction of a second faster (arguably; it is probably slower in the end, considering extensions load from disk). I think it was direct uBO sabotage.
TLDR: The first website loaded by starting the browser with a link or from the last session has an almost 100% chance of bypassing uBlock Origin. Chrome and Chromium-based browsers are not User Agents, they are Google Agents.
https://www.man7.org/linux/man-pages/man2/syscall.2.html
https://www.man7.org/linux/man-pages/man2/syscalls.2.html
There are also the manual pages for each individual system call.
The syscall numbers unfortunately cannot be found in the manual. They are found in published tables on the internet.
They are defined in numerous locations in the Linux kernel tree. They are included via the linux/unistd.h header which in turn includes the appropriate asm-generic/ and asm/ headers.
https://github.com/torvalds/linux/blob/master/include/uapi/a...
https://github.com/torvalds/linux/blob/master/tools/arch/x86...
https://github.com/torvalds/linux/blob/master/tools/arch/x86...
https://github.com/torvalds/linux/blob/master/tools/arch/arm...
There's quite a bit of complexity here. The system call numbers are stable for each architecture but may differ between architectures. Some architectures have multiple historical versions of the same system call which are maintained for backwards compatibility, others have just the latest version of the relevant system call with the version number removed.
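To make the per-architecture differences concrete, here is a small illustrative sketch. The numbers are a hand-picked sample from the x86_64, i386 and arm64 (asm-generic) tables; the dictionary layout and the `syscall_number` helper are my own naming, not anything the kernel provides.

```python
# A tiny sample of Linux syscall numbers on three ABIs.
# arm64 follows the asm-generic numbering, which never defined the
# legacy open(2); only openat(2) exists there.
SYSCALL_NUMBERS = {
    "x86_64": {"read": 0, "write": 1, "open": 2, "getpid": 39, "openat": 257},
    "i386": {"read": 3, "write": 4, "open": 5, "getpid": 20, "openat": 295},
    "arm64": {"read": 63, "write": 64, "getpid": 172, "openat": 56},
}


def syscall_number(arch, name):
    """Return the number of `name` on `arch`, or None if this table has no entry."""
    return SYSCALL_NUMBERS.get(arch, {}).get(name)


if __name__ == "__main__":
    for arch in SYSCALL_NUMBERS:
        print(arch, "write =", syscall_number(arch, "write"),
              "open =", syscall_number(arch, "open"))
```

The same name maps to a different number on each ABI, and a legacy call such as open is simply absent on the newer one.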
I assume this complexity is the reason why this information is not typically included. People expect you to rely on the libc which abstracts all this.
Just because you can search the source, doesn't mean this wouldn't come in handy to someone some day.
Or is it just a quirk of history?
Coming from BSD, I find this very confusing and tend to grumble when I'm tracing something and have to go groveling for the right errno.h. E.g.:
% find ~/linux/ -name 'errno*' | wc -l
22
% find ~/freebsd/sys -name 'errno*' | wc -l
1
I vaguely remember reading somewhere that the MIPS ones are weird to support compatibility with the existing unix syscall numbering, but I can't find any evidence for that anywhere, so maybe it was aspirational or I'm hallucinating.
Edit: this answer has some relevant details:
https://stackoverflow.com/questions/63713056/why-is-their-a-...
I recently had enough of parsing the various syscall.h files on different architectures and wrote a debugfs syscall info reader instead. That way you can see all tracepoint-instrumented syscalls and arguments available exactly on your currently running kernel on your platform:
https://tanelpoder.com/posts/list-linux-system-call-argument...
Edit: changed "all" to "all tracepoint-instrumented" based on a comment below - some added syscalls don't (immediately) get instrumented with a tracepoint so tracefs wouldn't show them (until someone instruments them in a later kernel version as seems to be the case). The tracefs approach has been good enough for me, but the only 100% guaranteed way to see all currently available syscalls would be to read the syscall table from kernel memory and see which syscall handler kernel functions they call (as the syscall name itself is meaningless inside the kernel).
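A rough sketch of the tracefs idea, not the tool from the linked post: it assumes tracefs is mounted at /sys/kernel/tracing (older setups use /sys/kernel/debug/tracing) and that each instrumented syscall has a `sys_enter_<name>` directory under `events/syscalls`. The function name and default path are my own choices, and reading the directory usually needs root.

```python
import os


def traced_syscalls(events_dir="/sys/kernel/tracing/events/syscalls"):
    """List syscall names that have a sys_enter_* tracepoint under events_dir."""
    prefix = "sys_enter_"
    names = []
    for entry in os.listdir(events_dir):
        full_path = os.path.join(events_dir, entry)
        # Skip sys_exit_* directories and control files such as "enable".
        if entry.startswith(prefix) and os.path.isdir(full_path):
            names.append(entry[len(prefix):])
    return sorted(names)


if __name__ == "__main__":
    try:
        for name in traced_syscalls():
            print(name)
    except OSError as exc:  # tracefs not mounted, or not readable without root
        print(f"cannot read tracefs: {exc}")
```

As noted above, this only sees syscalls that have a tracepoint, so it can miss recently added ones.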
(x32 is rare https://en.wikipedia.org/wiki/X32_ABI )
I've been using other tables but they were always incomplete and often x86_64 only. This one contains everything: number, symbol, links to kernel implementation, signature, user space ABI registers. And I can select kernel version, kernel binary interface and processor architecture!
I'm very interested in how you are collecting or generating all this information. Please post details on the process. I need similar information in order to compile system call tables into lone, my own programming language which features direct Linux system call support.
I use scripts that parse the information out of Linux user space API headers: the compiler prints all the preprocessor definitions from linux/unistd.h, the "__NR_" definitions are selected and then turned into a C array initializer for a number/name structure.
# makefile
$(call source_to_object,source/lone/lisp/modules/intrinsic/linux.c): $(targets.NR.c)
$(targets.NR.c): $(targets.NR.list) scripts/NR.generate
	scripts/NR.generate < $< > $@
$(targets.NR.list): scripts/NR.filter
	$(CC) -E -dM -include linux/unistd.h - < /dev/null | scripts/NR.filter > $@
# scripts/NR.filter
grep __NR_ | sed 's/#define //g' | cut -d ' ' -f 1
# scripts/NR.generate
# generates C array initializers like:
# { "read", __NR_read },
while read -r NR; do
    printf '{ "%s", %s },\n' "${NR#__NR_}" "${NR}"
done
// source/lone/lisp/modules/intrinsic/linux.c
static struct linux_system_call {
    char *symbol;
    lone_lisp_integer number;
} linux_system_calls[] = {
    /* huge generated array initializer
     * with all the system calls found
     * on the host platform
     */
    #include <lone/lisp/modules/intrinsic/linux/NR.c>
};
I realized soon in the process that simply looking at kernel sources was not enough to extract everything accurately, especially definition locations. I also wanted this to be a tool to extract the syscalls actually implemented by a given kernel image, so that's what it does.
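For comparison, the header-based pipeline described earlier in the thread (dump the preprocessor definitions, keep the `__NR_` ones, build a name/number table) can be sketched in a few lines of Python. This is not the lone build itself: `parse_nr_defines` and `host_syscall_table` are hypothetical names, and the sketch assumes a `cc` on the PATH with the Linux uapi headers installed.

```python
import re
import subprocess

# Matches lines such as "#define __NR_read 0". Definitions whose value is an
# expression, e.g. "(__X32_SYSCALL_BIT + 0)", are deliberately not matched.
DEFINE_RE = re.compile(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$")


def parse_nr_defines(preprocessor_output):
    """Build a {name: number} dict from `cc -E -dM` style output."""
    table = {}
    for line in preprocessor_output.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            table[match.group(1)] = int(match.group(2))
    return table


def host_syscall_table(cc="cc"):
    """Run the same compiler command as the makefile rule and parse its output."""
    result = subprocess.run(
        [cc, "-E", "-dM", "-include", "linux/unistd.h", "-"],
        input="", capture_output=True, text=True, check=True,
    )
    return parse_nr_defines(result.stdout)


if __name__ == "__main__":
    sample = "#define __NR_read 0\n#define __NR_write 1\n#define __linux__ 1\n"
    print(parse_nr_defines(sample))
```

One design difference worth noting: the shell version emits the macro name (`__NR_read`) and lets the C compiler resolve it, so it also copes with architectures where the value is an expression. This sketch only handles literal numbers.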
Your approach should be fine; it is basically what any other language does: rely on the uapi headers provided by the kernel (just beware that some may be generated at build time inside e.g. include/asm/generated/xxx). You should rely on the headers that are exported when you do `make headers_install`. Also, make sure to have a generic syscall() function that takes an arbitrary syscall number and an arbitrary number of args, so you can make raw syscalls for the weird ones you don't easily find in uapi headers, and you should be good. After all, even the C library headers sometimes lack some of the "weird" syscalls.
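As an illustration of such a generic entry point, here is a minimal sketch that goes through libc's variadic `syscall()` wrapper via ctypes. It is not how lone does it: `raw_syscall`, `host_getpid_nr` and the two-entry `GETPID_NR` table are made up for the example, it only handles integer arguments, and it assumes a 64-bit Linux process on x86_64 or aarch64.

```python
import ctypes
import os
import platform
import sys

# getpid numbers for two 64-bit Linux ABIs (illustrative, not exhaustive).
GETPID_NR = {"x86_64": 39, "aarch64": 172}

_libc = None


def raw_syscall(number, *args):
    """Call libc's syscall() with integer arguments; raise OSError on failure."""
    global _libc
    if _libc is None:
        # CDLL(None) exposes symbols already loaded into the process, libc included.
        _libc = ctypes.CDLL(None, use_errno=True)
        _libc.syscall.restype = ctypes.c_long
    result = _libc.syscall(ctypes.c_long(number),
                           *(ctypes.c_long(a) for a in args))
    if result == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result


def host_getpid_nr():
    """Return the getpid number for this host, or None if the table lacks it."""
    if not sys.platform.startswith("linux") or sys.maxsize <= 2**32:
        return None
    return GETPID_NR.get(platform.machine())


if __name__ == "__main__":
    nr = host_getpid_nr()
    if nr is None:
        print("host not covered by this sketch")
    else:
        print(raw_syscall(nr), os.getpid())  # the two values should match
```

The number is passed in as plain data, so anything the headers do not name can still be called once you know its number.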
len_in being the passed argument and len being the page aligned len.
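Assuming "page aligned" here means the usual round-up to the next multiple of the page size, the relationship between the two values can be sketched as follows. `page_align` is a hypothetical helper, not the kernel's code, and the bit trick relies on the page size being a power of two.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE  # page size of the running system


def page_align(len_in, page_size=PAGE_SIZE):
    """Round len_in up to the next multiple of page_size (a power of two)."""
    return (len_in + page_size - 1) & ~(page_size - 1)


if __name__ == "__main__":
    for len_in in (0, 1, 4096, 4097):
        print(len_in, "->", page_align(len_in, 4096))
```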