On Linux at least, the system call mechanism works on most architectures by placing the request, a syscall number plus its arguments, in specific registers. Some of those arguments may be pointers to C structs or buffers in the process's own memory.
The harder part is actually forcing the CPU to switch into kernel space so it can run the privileged kernel code that services the call. This is done by deliberately raising an exception of some sort. A divide by 0, an overflow or a segfault are faults of the same family; for a syscall the process typically executes a trap or software-interrupt instruction (int 0x80 on 32-bit x86, for example), which forces the kernel to take over execution to handle it.
Normally the kernel handles a fault by either killing the offending process or running a user-supplied handler. For a syscall, however, it instead checks the agreed-upon registers (and any memory they point to) and, if they contain a valid syscall request, runs it using the data provided by the user process. Setting this up usually takes some hand-crafted assembly, so to make syscalls easier to use the system's C library wraps each one in an ordinary function. For a lower-level interface, see http://man7.org/linux/man-pages/man2/syscall.2.html for some information on how syscalls work and how you can call them without a C wrapper.
This is, admittedly, an oversimplification: it is not true on all architectures (MIPS has a dedicated syscall instruction) and does not necessarily work the same way on all OSes. Still, if you have any comments or questions, please ask.
Amended: Note, regarding your comment about things in /dev/, this is actually a higher-level interface to the kernel, not a lower-level one. These devices use (about) 4 syscalls underneath. Writing to one is the same as a write syscall, reading from one is a read syscall, opening or closing one is equivalent to the open and close syscalls, and running an ioctl invokes the ioctl syscall, which is itself an interface to one of the system's many ioctl requests (special, usually device-specific operations with too narrow a use to justify writing a whole syscall for them).