No doubt, the kernel itself is also quite complex… but my comment here is from the user-experience perspective: for me at least, “it just works”. So I’m not trying to imply that it will work flawlessly for everybody, nor that it’s due to the simplicity of the stack, only that it works for me.
In case others are interested in the general compute aspect, e.g. inference for self-hosted AI, here is something related I found: