Single baremetal project using both ARM cores

Unsolved
uzleo
Junior(0)

Hi, I hope all are fine. I have a working video processing design on the ZedBoard in which alternate frames are processed on separate cores (even frames on CPU0, odd frames on CPU1). Each CPU runs its own bare-metal application, and the two applications contain nearly identical code. I was wondering whether it is possible to remove this redundancy and develop a single bare-metal application (say on CPU0) that uses both cores to do the same task.
One possible implementation I can think of is to have CPU0 interrupt CPU1 for each alternate frame, with the CPU1 ISR bound to the common function in CPU0's code space (everything in the CPU0 software), so that no separate application project is needed for CPU1. Is this possible? Or can you suggest other ways to solve this problem? Thanks.

hockeyman1972
Junior(11)
Maybe more trouble than it's worth?

Hi uzleo,
  Sounds like you are trying to replicate the SMP functionality built into some operating systems, such as Linux. However, I'm not aware of any way to target a particular CPU at finer than project granularity, which is what you would need in order to split functions in a single project between cores. In a single bare-metal project you normally have a single thread of execution, and that is tied to a single core.
Ron

uzleo
Junior(0)
Hi Ron,
Thanks for the advice. I also think that, without advanced multicore debugging techniques, it is "maybe more trouble than it's worth".
Anyway, I tried a hack based on the fact that CPU1 wakes up when CPU0 issues an SEV. Instead of writing the start address of a separate bare-metal application to 0xfffffff0, I wrote a function pointer to the common function in the CPU0 application. At the end of the function, if the executing core is CPU1, it jumps back to its initial sleep state. It works!!

josevi
Junior(0)
This is amazing! So you have Core 1 wake up, pull data from the shared memory and run a function from Core 0's code, then go back to sleep?

How do you synchronize their data access and storage?

I am very interested in this. Could you post your wake-up function call so that I can see how you did all this?

uzleo
Junior(0)
Hi josevi,
The nature of my application is such that the two cores never access the same data simultaneously, so there is no special requirement for synchronization right now and I haven't implemented any. I know it is not standard programming practice, but as I mentioned, it is a hack!
Anyway, I am posting some code, as I don't completely get what you are asking for. Hope this helps:

Initially, after setting up the platform, I use:
// initializing cpu1
Xil_Out32(0xfffffff0, (u32) funcPtr_CPU1init);
dmb(); // Wait until memory write has finished.
sev();

where funcPtr_CPU1init is declared earlier as:
void (*funcPtr_CPU1init)() = CPU1_init;
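
The wake-up sequence above can be wrapped in one helper so that every hand-off to CPU1 goes through the same three steps (write the pointer, barrier, SEV). What follows is only a sketch, not the code used in the project. The names cpu1_post, CPU_DMB, CPU_SEV and process_odd_frame are made up for illustration, and the mailbox address is passed in as a parameter so that the same code also compiles on a host PC, where the barrier and event macros become no-ops. On the Zynq the mailbox would be the word at 0xfffffff0 used above.

```c
#include <stdint.h>

/* Hypothetical wrappers: real instructions on 32-bit ARMv7 and later,
 * no-ops elsewhere so the sketch can be compiled and tried on a host PC. */
#if defined(__arm__) && defined(__GNUC__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
#define CPU_DMB() __asm__ volatile ("dmb" ::: "memory")
#define CPU_SEV() __asm__ volatile ("sev" ::: "memory")
#else
#define CPU_DMB() ((void)0)
#define CPU_SEV() ((void)0)
#endif

typedef void (*cpu1_task_t)(void);

/* Hand `task` to the parked core: publish its address in the mailbox word,
 * order that write before the event, then send the event. */
void cpu1_post(volatile uintptr_t *mailbox, cpu1_task_t task)
{
    *mailbox = (uintptr_t)task;
    CPU_DMB();
    CPU_SEV();
}

/* Placeholder for the function CPU1 should run (illustrative only). */
void process_odd_frame(void)
{
}
```

On the target this might be called as cpu1_post((volatile uintptr_t *) 0xfffffff0, process_odd_frame); the cast and the task name are assumptions, not part of the original project.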

The CPU1_init function is executed by CPU1. It simply duplicates the initial WFE code (the code the BootROM writes to 0xffffff00) and then moves CPU1's PC to that copy. This should not be needed, but I noticed that the code at 0xffffff00 somehow got overwritten at runtime, which made CPU1 crash. Keeping the code in a separate DDR memory location solved the problem:

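void CPU1_init() {
    // Copy the WFE wait loop (ARM opcodes) into DDR at 0x06000000.
    Xil_Out32((u32) 0x06000000, (u32) 0xe3e0000f); // mvn r0, #0xf   ; r0 = 0xfffffff0
    Xil_Out32((u32) 0x06000004, (u32) 0xe3a01000); // mov r1, #0
    Xil_Out32((u32) 0x06000008, (u32) 0xe5801000); // str r1, [r0]   ; clear the jump address
    Xil_Out32((u32) 0x0600000c, (u32) 0xe320f002); // wfe            ; sleep until an event
    Xil_Out32((u32) 0x06000010, (u32) 0xe5902000); // ldr r2, [r0]   ; read the jump address
    Xil_Out32((u32) 0x06000014, (u32) 0xe1520001); // cmp r2, r1
    Xil_Out32((u32) 0x06000018, (u32) 0x0afffffb); // beq back to wfe if still zero
    Xil_Out32((u32) 0x0600001c, (u32) 0xe1a0f002); // mov pc, r2     ; jump to the function

    // Jump to the relocated wait loop.
    asm volatile ("bx %0" : : "r" (0x06000000));
}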

So now CPU1 is started and waiting for an SEV from CPU0. Whenever I want CPU1 to execute some function, I simply write its pointer to 0xfffffff0 and issue an SEV, as shown above for the init function. The only catch is identifying the executing processor at the end of that function. For that I used:

u32 reg_val;
asm volatile ("mrc p15,0,%0,c0,c0,5\n" : "=r" (reg_val));
if (reg_val == 0x80000001) {
asm volatile ("bx %0" : : "r" (0x06000000));
}

This simply reads the ARM MPIDR register: if its value is 0x80000001 (LSB '1', i.e. CPU1), it branches back to 0x06000000; if the LSB is '0' (CPU0), the function simply returns in the usual way.
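
For reference, the CPU number can also be extracted from the MPIDR value instead of comparing the whole register. A minimal sketch, assuming the ARMv7 layout in which the lowest affinity field (bits [7:0]) holds the CPU number within the cluster. mpidr_cpu_id and MPIDR_AFF0_MASK are illustrative names, and the argument is the value obtained with the mrc instruction shown above:

```c
#include <stdint.h>

/* Affinity level 0 field of MPIDR (assumed ARMv7 layout, bits [7:0]). */
#define MPIDR_AFF0_MASK 0xFFu

/* Return the CPU number encoded in a raw MPIDR value. */
unsigned int mpidr_cpu_id(uint32_t mpidr)
{
    return (unsigned int)(mpidr & MPIDR_AFF0_MASK);
}
```

With the value from the snippet above, mpidr_cpu_id(0x80000001) gives 1 (CPU1), while a value of 0x80000000 would give 0.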

Anyway, I have noticed some issues with the cache setup on CPU1, as its execution appears to be relatively slow compared with the standalone AMP case. I assume that has something to do with not executing the startup code provided by the BSP (boot.S, cpu_init.S, xil_crt0.S etc.). So right now I am looking for ways to initialize CPU1 with a configuration similar to that of the standalone AMP case.

uzleo
Junior(0)
Also, one has to reset the CPU1 stack pointer at the end of its function execution (before jumping to 0x06000000 in the code above) to ensure that its stack doesn't overflow at runtime.

divcesar
Junior(0)
Hi uzleo,
I understand that it's been a long time, but would you mind sharing the source code for this? I am having some trouble understanding where each code snippet should go.
 
Thanks,

anthonyp
Junior(0)
Freeware SMP RTOS

Check out the completely free SMP RTOS from Code Time Technologies: http://code-time.com/smp_free.html

The user guide is here: http://code-time.com/pdf/mAbassi%20-%20User's%20Guide.pdf