Author |
Message |
guy
Joined: 21 Oct 2005 Posts: 297
|
Direct stack manipulation? |
Posted: Sun Aug 20, 2017 5:31 am |
|
|
MCU: PIC24FJ64GA308
I have code that works with a cellular modem and uploads a file over FTP. If for some reason (and there could be a dozen of those) I am stuck waiting for a response string, a watchdog timer will reset the chip. It works but it's not cool.
The code is divided into several functions, and error handling is hard and time consuming.
Is there a way to create a mechanism similar to an operating system, in which if a process hangs (does not restart a watchdog timer) the code returns to a point in the Main routine REGARDLESS of the stack? Of course the stack should be sorted out or cleaned somehow so I can later go into Retry.
*Please don't suggest old-school structured programming techniques of error handling. I am interested in learning something new... |
|
|
temtronic
Joined: 01 Jul 2010 Posts: 9269 Location: Greensville,Ontario
|
|
Posted: Sun Aug 20, 2017 7:07 am |
|
|
re: ...
If for some reason (and there could be a dozen of those) I am stuck waiting for a response string,
If this is a 'serial' response, CCS does show a 'timed receive' function, I think in the FAQ section of the manual. I've used a version of it to allow the PIC to know when the host PC 'dies'. It's also applicable to, say, GSM modems or RFID modules, where 'something' should be sent from them but alas it takes too long. You should set the 'timeout' time to, say, 2X or 3X the known response time.
Jay |
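As an illustration of the 'timed receive' idea above, here is a minimal sketch in portable C. It is not the routine from the CCS manual: `rx_port_t`, `timed_getc` and the `sim_*` stand-in port are names made up for this example, and the three hooks are assumed to wrap whatever the compiler provides for testing for a waiting byte, reading it, and delaying about one millisecond.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hardware hooks (assumption, not a CCS API). On the PIC these
   would wrap the "byte waiting?" test, the byte read, and a ~1 ms delay. */
typedef struct {
    bool    (*byte_ready)(void);
    uint8_t (*read_byte)(void);
    void    (*wait_1ms)(void);
} rx_port_t;

/* Wait up to timeout_ms for one byte.
   Returns true and stores the byte in *out, or false if nothing arrived. */
bool timed_getc(const rx_port_t *port, uint16_t timeout_ms, uint8_t *out)
{
    uint16_t waited = 0;

    while (!port->byte_ready()) {
        if (waited >= timeout_ms) {
            return false;           /* timed out: the caller decides what to do */
        }
        port->wait_1ms();
        waited++;
    }
    *out = port->read_byte();
    return true;
}

/* Stand-in port for desk checking only: a single 'K' becomes available
   once sim_now_ms reaches sim_arrival_ms. */
static uint16_t sim_now_ms;
static uint16_t sim_arrival_ms;
static bool    sim_ready(void) { return sim_now_ms >= sim_arrival_ms; }
static uint8_t sim_read(void)  { return 'K'; }
static void    sim_wait(void)  { sim_now_ms++; }
static const rx_port_t sim_port = { sim_ready, sim_read, sim_wait };
```

With a device whose known response time is about 5 ms, a call such as `timed_getc(&sim_port, 15, &b)` follows the 2X to 3X rule of thumb from the post. A `false` return is the point where the caller can abandon the exchange instead of hanging until the watchdog fires.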
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Sun Aug 20, 2017 8:54 am |
|
|
Realistically the only place you can go back to with the stack sorted out, is a restart.
This is where 'restart_cause' comes in.
Design your code so the main variables are static, and not initialised by the compiler (so no 'initialisation' values), and then test restart_cause. If it is the normal power on reset, initialise the variables yourself. If not, you are back into the 'main' without the variables being changed. This is how you can program for a watchdog, but equally you can test for a software reset, and use the reset_cpu instruction.
Similarly, you can also not initialise peripherals if required (use NO_INIT in the #use declarations), and only physically initialise these when you require. So you could (for instance) have two tests on restart cause, and if it is a reset_cpu, initialise nothing, but if it is a watchdog, initialise the hardware.
However, it does come down to why your code itself does not exit tidily. For instance, if you are calling things that read (like serial input), these can either be tested before reading, or have a timeout (as Temtronic says). |
|
|
newguy
Joined: 24 Jun 2004 Posts: 1911
|
|
Posted: Sun Aug 20, 2017 9:03 am |
|
|
I also must put forth a timer based "graceful exit". Very general code flow would be:
- I'm expecting a response within x seconds
- either start a new timer or add a "looking_for_response" flag to an existing timer's routine that will throw another flag "comms_timed_out". If the link times out, do whatever you need to do to gracefully exit/abandon looking for a response.
- in your serial comm routine, if you get "x" response (the one you're looking for), then set "looking_for_response" flag to FALSE. If you started a new timer, stop it and disable its interrupt. |
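The flow in the list above could be sketched roughly as follows. The flag names `looking_for_response` and `comms_timed_out` come from the post; the function names, the `ticks_left` counter and the idea of calling `timer_tick()` from an existing periodic timer interrupt are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between the timer interrupt and mainline code, hence volatile. */
static volatile bool     looking_for_response = false;
static volatile bool     comms_timed_out      = false;
static volatile uint16_t ticks_left           = 0;

/* Mainline: call right after sending a command to the modem. */
void start_waiting(uint16_t timeout_ticks)
{
    comms_timed_out      = false;
    ticks_left           = timeout_ticks;
    looking_for_response = true;    /* armed last, so the ISR sees a full setup */
}

/* Call once per tick from the periodic timer interrupt (hypothetical source). */
void timer_tick(void)
{
    if (looking_for_response) {
        if (ticks_left > 0) {
            ticks_left--;
        }
        if (ticks_left == 0) {
            looking_for_response = false;
            comms_timed_out      = true;   /* mainline polls this and exits gracefully */
        }
    }
}

/* Serial routine: call when the response being looked for has arrived. */
void response_received(void)
{
    looking_for_response = false;
}
```

Mainline code then checks `comms_timed_out` wherever it would otherwise sit in a blocking wait. On a part where a 16-bit write is not atomic, `start_waiting()` would also need interrupts disabled around the update.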
|
|
guy
Joined: 21 Oct 2005 Posts: 297
|
|
Posted: Sun Aug 20, 2017 10:27 am |
|
|
Thank you guys.
Ttelmah, your idea to restart is very creative. It's like making the whole main() into the main loop and at the beginning check restart_cause to initialize registers & peripherals. Nice!
I am not talking about timeouts in comms. Imagine you are waiting for an OK or ERROR string from the modem, with a timeout. No problem so far. But now imagine you are waiting for a dozen of those in different parts of the code, since the code sends several different commands. Each time the command and parsing are different, each test can lead to an error, and structured programming is not really built for that. In C# there is try/catch for that.
For PICs, goto is one option, but it only works inside the function (in other words, when the call stack is not involved). |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Mon Aug 21, 2017 2:50 am |
|
|
I'd probably suggest it is tidier to make the main a 'wrapper'.
So create your own 'main_code' routine, and 'software_init'/'hardware_init' routines, and then just have the normal 'main' decide what to call.
It's important to understand 'why' this does not exist. It is fundamentally not part of C. C inherently does not have an ability to do this. You can _inside a routine_, generate a try/catch type mechanism (using setjmp and longjmp), but to make this come out multiple layers, requires you to completely control the stack. This will get very complex (you could generate a stack buffer table, and save W15 for particular depths of jump/restore), but the odds of getting it to work reliably are low....
The alternative, is to work the other way.
The hardware 'reset_cpu' instruction, explicitly resets the stack to the boot state. So treat this as the external 'master' call, which always takes you to the 'wrapper' function. Then split this function up:
Code: |
void main(void)
{
   unsigned int8 cause;
   cause=restart_cause();
   switch(cause) {
   case RESTART_POWER_UP:
   case RESTART_MCLR:
      //these need to initialise the hardware
      hardware_init();
      //drop through
   case RESTART_TRAP:
   case RESTART_ILLEGAL_OP:
   case RESTART_TRAP_CONFLICT:
      //These could be caused by invalid register values
      //so initialise the registers as well
      software_init();
      //again drop through
   case RESTART_WATCHDOG:
   case RESTART_SOFTWARE:
      //Now into the main code
      main_code();
      break;
   }
   //should never get here, force a software restart if I do
   reset_cpu();
}
|
I have code that has to be able to be reset by a telephone call, to 'restart', but retain its configuration. Since I'm talking to up to seven interfaces I needed a 'backstop' method of recovery, and this has worked well for me.
The key thing becomes how you declare and initialise the variables. Each routine maintains its own local variables, which are therefore re-initialised when the routine is called again from the outside, but the master configuration is in data structures that are re-loaded from an SD card in the 'software_init' routine. RESTART_SOFTWARE takes you past this, so carries on with the existing configuration data. |
|
|
guy
Joined: 21 Oct 2005 Posts: 297
|
|
Posted: Mon Aug 21, 2017 3:16 am |
|
|
Excellent example for future generations to come!
In my case it is a gateway that most of the time waits for wireless packets and once every 24h uploads the data to a server. Since the main loop is very simple, a reset will not be hard to handle. I will just make sure not to lose the data and time after a software/WDT reset.
Thanks! |
|
|
RF_Developer
Joined: 07 Feb 2011 Posts: 839
|
|
Posted: Mon Aug 21, 2017 4:18 am |
|
|
I think an important takeaway here is that when you are contemplating something like stack manipulation, then something's gone way too far, and there has to be another way.
My personal approach would be like Newguy's: use timers or a clock tick to implement timeouts, separating the sending of messages from dealing with responses, timeouts being handled in mainline code. I'm not so keen on the watchdog/restart approach. Either way, what you don't want to be doing is waiting, e.g. with delay_ms() inside routines.
I have used a setjmp based approach for try-catch type code with some success, but that was with a one-shot main where each run was independent. It was for a battery-powered Go/No Go test box that ran a series of short tests when it was powered up by a push-button, displayed the results on LEDs and then switched itself off. The try-catch was entirely in main() and there was no loop. It worked great, and the boxes are still on the first set of batteries (4 x AA) after a couple of years, but it wasn't implementing timeouts and there wasn't any attempt at multithreading, i.e. a timeout running in another context, such as an interrupt. |
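For readers who have not seen the setjmp style of try-catch described above, here is a minimal sketch using the standard C `<setjmp.h>` calls. It is not the code from the test box: `run_tests`, `require` and `failed_step` are names invented for this example, and it assumes a compiler with standard setjmp/longjmp behaviour, which should be checked against the CCS manual before relying on it on a PIC.

```c
#include <setjmp.h>

static jmp_buf      on_fail;       /* the "catch" point */
static volatile int failed_step;   /* which step raised the failure */

/* Hypothetical helper: abandon the sequence if a check did not pass. */
static void require(int ok, int step)
{
    if (!ok) {
        failed_step = step;
        longjmp(on_fail, 1);       /* never returns; resumes at the setjmp below */
    }
}

/* One-shot sequence of checks.
   Returns 0 if all pass, otherwise the number of the first failing step. */
int run_tests(int a_ok, int b_ok, int c_ok)
{
    if (setjmp(on_fail) != 0) {
        /* "catch": reached only through longjmp() in require() */
        return failed_step;
    }

    /* "try": a flat sequence with no nested ifs */
    require(a_ok, 1);
    require(b_ok, 2);
    require(c_ok, 3);
    return 0;
}
```

The jump is only valid while `run_tests()` is still active, which fits the one-shot, no-loop use described in the post. It does nothing for a timeout raised from an interrupt context.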
|
|
guy
Joined: 21 Oct 2005 Posts: 297
|
|
Posted: Mon Aug 21, 2017 5:44 am |
|
|
Quote: | Structured programming is a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by making extensive use of subroutines, block structures, for and while loops—in contrast to using simple tests and jumps such as the go to statement, which could lead to "spaghetti code" that was difficult to follow and maintain. |
While this is true in some cases, some program logic structures such as state machines, try-catch sequences etc. create the opposite effect where the code is less readable, for example:
Code: | do this
if(!fail) {
    continue
    if(!fail) {
        continue2
        if(!fail) {
        }
    }
} |
instead of
Code: | do this
if(fail) goto failed
continue
if(fail) goto failed
continue2
if(fail) goto failed
...
failed: handle |
If you want the code to be clearer you would use functions instead of one long sequence, but the use of goto is not allowed between functions due to the stack usage.
I vote for creative programming as long as it's not dirty programming. |
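The goto version above, written out as compilable C, might look like the sketch below. The function name, the three boolean inputs (which stand in for the result of each modem exchange) and the negative return codes are all assumptions for illustration.

```c
#include <stdbool.h>

/* Hypothetical three-step upload. Each *_ok argument stands in for the
   outcome of one command/response exchange with the modem.
   Returns 0 on success, or -1, -2, -3 for the step that failed. */
int upload_file(bool open_ok, bool login_ok, bool send_ok)
{
    int stage = 0;                 /* how many steps have completed */

    if (!open_ok)  goto failed;
    stage = 1;
    if (!login_ok) goto failed;
    stage = 2;
    if (!send_ok)  goto failed;
    return 0;                      /* every step succeeded */

failed:
    /* single place for cleanup and retry bookkeeping */
    return -(stage + 1);
}
```

The happy path reads top to bottom without nesting, and all error handling sits under one label. As the post says, this only works while the whole sequence lives inside one function.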
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Mon Aug 21, 2017 6:34 am |
|
|
Yes. It is an important distinction.
Internally every routine and data layout should if possible follow structured procedures, but the presence of external 'trap' capabilities, can in some cases be much cleaner. Interrupt programming in particular is better done without getting hooked on structured programming.
This is of course why languages like C# have the try/catch abilities, and in a very real sense we already have a master trap ability in the watchdog. The 'restart from go' ability is inherent in the PIC instruction set, and using this carefully can in some cases save a lot of complexity. But, keyword, 'carefully'... |
|
|
guy
Joined: 21 Oct 2005 Posts: 297
|
|
Posted: Fri Sep 08, 2017 5:55 am |
|
|
I just found out that one of the newer PICs, the PIC16F18855 (and others I suppose), has a special bit to indicate a reset caused by a Reset instruction.
PCON0 register,
Quote: |
bit 2 RI: RESET Instruction Flag bit
1 = A RESET instruction has not been executed or set to ‘1’ by firmware
0 = A RESET instruction has been executed (cleared by hardware)
|
Just what we need here! |
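A small sketch of how the quoted RI bit could be interpreted, going only by the bit description above (bit 2, cleared by hardware when a RESET instruction has executed). The function and macro names are made up for this example, and the register value is passed in as a plain byte because how PCON0 is read and written depends on the device and compiler.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCON0_RI_MASK (1u << 2)    /* bit 2, per the quoted register description */

/* True if the value read from PCON0 shows that a RESET instruction ran
   (RI is cleared to 0 by hardware in that case). */
bool reset_instruction_ran(uint8_t pcon0)
{
    return (pcon0 & PCON0_RI_MASK) == 0;
}

/* Value to write back so that the next RESET instruction can be detected:
   firmware sets RI to 1 again, leaving the other bits untouched. */
uint8_t rearm_reset_flag(uint8_t pcon0)
{
    return (uint8_t)(pcon0 | PCON0_RI_MASK);
}
```

Startup code would read the register once, call `reset_instruction_ran()` to decide between a warm and a cold start, and then write back the value from `rearm_reset_flag()`.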
|
|
temtronic
Joined: 01 Jul 2010 Posts: 9269 Location: Greensville,Ontario
|
|
Posted: Fri Sep 08, 2017 8:08 am |
|
|
Something not mentioned here is the probability that variables in RAM may also be corrupted when the PIC visits 'Lala' land.
It's quite possible that stuff other than the stack will be 'modified', perhaps even pin directions, so doing a partial 'warm boot', so to speak, may not be a good idea; a full 'reset' or 'hard' reboot ensures variables get set to KNOWN values.
just something to ponder...
Jay |
|
|
guy
Joined: 21 Oct 2005 Posts: 297
|
|
Posted: Fri Sep 08, 2017 8:33 am |
|
|
Are you basing this on experience? IMHO, if there is a special command for Reset, a defined VDD for RAM retention, etc., the whole scenario should be stable regarding RAM and SFRs. This is all documented. |
|
|
temtronic
Joined: 01 Jul 2010 Posts: 9269 Location: Greensville,Ontario
|
|
Posted: Fri Sep 08, 2017 9:07 am |
|
|
Yes, the Real World isn't always nice... I had some bad crosstalk/EMI on an early project (20 years ago) and the PIC went to 'LALA' land, got 'hung up', and several variables in RAM were corrupted, so I am leery about 'soft reboots' where not all SFRs, RAM, etc. get reinitialized to known values.
Maybe the new PICs are better but 'once bitten, twice shy'.
Jay |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19589
|
|
Posted: Fri Sep 08, 2017 12:35 pm |
|
|
The restart_cause function already tests that bit and will tell you that the system has been software restarted.
As others have said the caveat is you have to ensure that the startup sets up what needs to be setup.
I have a system that can have its configuration changed from a file loaded from a server, triggered by a text message. Once this is received, it does reset_cpu. The code resets all the things like counters etc. to their 'boot' values, but does not re-initialise the other peripherals. However, a reset caused by an error very carefully does reset the peripherals, in case one of these is what is causing the error. |
|
|
|