[libc] Implement a generic streaming interface in the RPC
Currently we provide the `send_n` and `recv_n` functions. These were
somewhat divergent and not tested on the GPU. This patch changes the
support to be more common. We do this my making the CPU provide an array
equal the to at least the lane size while the GPU can rely on the
private memory address of its stack variables. This allows us to send
data back and forth generically.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D150379