[VTA] Runtime refactor to allow for non-shared memory FPGAs (e.g. F1) (#3554)
* updated runtime to support non-shared memory FPGAs for instruction and micro-op kernels
* adding driver-defined memcpy function to handle F1 cases
* refactor to include flush/invalidate in memcpy driver function
* update tsim driver
* bug fixes
* cleanup
* pre-allocate fpga readable buffers to improve perf
* fix
* remove instruction stream address rewrite pass for micro-op kernels
* fix
* white spaces
* fix lint
* avoid signed/unsigned compilation warning
* avoid signed/unsigned compilation warning
* fix
* fix
* addressing comments
* whitespace
* moving flush/invalidate out of memmove
* cleanup
* fix
* cosmetic
* rename API
* comment fix
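
A minimal sketch of the idea behind the driver-defined memcpy, not the code in this change: on a non-shared memory FPGA the host cannot dereference device DRAM, so every transfer goes through an explicit copy call that the driver implements. `SketchDeviceMemory`, `CopyFromHost`, and `CopyToHost` are hypothetical names, and a `std::vector` stands in for the FPGA-attached DRAM that a real driver would reach over DMA.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for FPGA-attached DRAM that the host cannot
// address directly. A real driver would issue DMA transfers instead.
class SketchDeviceMemory {
 public:
  explicit SketchDeviceMemory(std::size_t size) : dram_(size, 0) {}

  // Host -> device copy. Returns false if the range is out of bounds.
  bool CopyFromHost(std::size_t dst_offset, const void* src, std::size_t size) {
    if (!InRange(dst_offset, size)) return false;
    if (size == 0) return true;  // nothing to copy
    std::memcpy(dram_.data() + dst_offset, src, size);
    return true;
  }

  // Device -> host copy. Returns false if the range is out of bounds.
  bool CopyToHost(void* dst, std::size_t src_offset, std::size_t size) const {
    if (!InRange(src_offset, size)) return false;
    if (size == 0) return true;  // nothing to copy
    std::memcpy(dst, dram_.data() + src_offset, size);
    return true;
  }

 private:
  // Overflow-safe check that [offset, offset + size) lies inside dram_.
  bool InRange(std::size_t offset, std::size_t size) const {
    return offset <= dram_.size() && size <= dram_.size() - offset;
  }

  std::vector<std::uint8_t> dram_;
};
```

On a shared-memory target the same two entry points could reduce to a plain `memcpy`, which is why the copy is left to the driver rather than hard-coded in the runtime.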