Just for the edification of the masses:
Under strace, pthread_create(3) looks something like this:
clone(child_stack=0x7f79d754bff0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f79d754c9d0,
tls=0x7f79d754c700, child_tidptr=0x7f79d754c9d0) = 31230
(Newlines are mine.)
Of course, those pointers are only that size on a 64-bit architecture. The flags are the real point of interest.
fork(2) is like this:
clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f543a9aaa10) = 17978
When clone(2) is implementing fork(2), the value it returns to the parent process is the child's PID. When it is implementing pthread_create(3), the value returned to the parent is still an integer that uniquely identifies the new thread (its thread ID). strace treats that ID as if it were a PID when it traces the system calls of individual threads into separate files, which strace can do because it's awesome.
Some more information:
> Linux has a unique implementation of threads. To the Linux kernel, there is no concept of a thread. Linux implements all threads as standard processes. The Linux kernel does not provide any special scheduling semantics or data structures to represent threads. Instead, a thread is merely a process that shares certain resources with other processes. Each thread has a unique task_struct and appears to the kernel as a normal process (which just happens to share resources, such as an address space, with other processes).
http://www.makelinux.net/books/lkd2/ch03lev1sec3