Hi,
I have experienced delays on the SPI bus. Transferring one byte over SPI with the Arduino SPI class takes 110 us, so there is a considerable delay between transferred bytes (roughly 57 times the duration of one byte at a 4 MHz bit rate).
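For anyone who wants to check the ideal figure, the arithmetic can be sketched in a few lines of C++. This is only an illustration of the calculation, not the code I ran on the board; the helper names `byte_time_us` and `slowdown` are made up for this post, and it assumes 8 bits per byte with no gap between bits.

```cpp
#include <cassert>

// Ideal time to clock out one 8-bit byte, in microseconds,
// assuming back-to-back bits (hypothetical helper, illustration only).
double byte_time_us(double bit_rate_hz) {
    assert(bit_rate_hz > 0.0);
    return 8.0 * 1e6 / bit_rate_hz;
}

// Ratio between a measured per-byte time and the ideal byte time
// at the given bit rate (hypothetical helper, illustration only).
double slowdown(double measured_us, double bit_rate_hz) {
    return measured_us / byte_time_us(bit_rate_hz);
}
```

At 4 MHz, `byte_time_us(4e6)` gives 2.0 us per byte, and `slowdown(measured_us, 4e6)` turns a measured per-byte time into the overhead factor.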
I also measured how long it takes to change the state of a pin (high to low), with the idea of building an alternative bus in software. The real surprise was the 330 ns per instruction needed to change the state of one pin! That makes it behave like a system running at 12 MHz (very slow for a CPU at 500 MHz and a microcontroller at 100 MHz).
Why are so many clock cycles needed to execute a basic ASM instruction?
Happy New Year 2015