C Turner
Joined: 10 Nov 2003 Posts: 40 Location: Utah
|
RAM Re-use and "scratch" variables |
Posted: Fri Jan 03, 2014 5:00 pm |
|
|
I have a head-scratcher:
Using a 16F1847 along with the current V5 compiler, I have a project that does two separate things:
- Tone generation, etc.
- DSP filtering of audio.
When operating, the processor is doing one thing OR the other.
The 1k of RAM is overkill for the tone generation, but I actually need to store 625 samples of audio: I managed to do this with 12 bit RAM storage (each sample contains 10 bits of audio - plus an extra 2 bits of LSBs) so I end up using about 940 bytes of RAM for that. This amount of RAM is "non-negotiable" owing to the nature of the DSP algorithm and the sampling rate and I don't have enough CPU time between samples to do the shifts, etc. to "scrunch" the RAM down even more (e.g. use 10 bits of ram per sample instead of 12 bits.)
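(For anyone following along: the exact buffer layout isn't shown in this thread, so the following is only a minimal sketch of one common way to hold 12 bit samples, with two samples sharing three bytes. The names `sample_buf`, `put_sample` and `get_sample` are hypothetical, and it is written as portable C rather than CCS-specific code.)

```c
#include <stdint.h>

#define NUM_SAMPLES 625
/* Two 12-bit samples share three bytes: ceil(625 * 12 / 8) = 938 bytes. */
#define BUF_BYTES ((NUM_SAMPLES * 12 + 7) / 8)

static uint8_t sample_buf[BUF_BYTES];

/* Store a 12-bit value at sample position idx (0 .. NUM_SAMPLES-1). */
void put_sample(uint16_t idx, uint16_t value)
{
    uint16_t base = (uint16_t)((idx >> 1) * 3);   /* 3 bytes per sample pair */

    value &= 0x0FFF;
    if ((idx & 1) == 0) {
        /* Even sample: low 8 bits in byte 0, top nibble in low half of byte 1. */
        sample_buf[base] = (uint8_t)(value & 0xFF);
        sample_buf[base + 1] =
            (uint8_t)((sample_buf[base + 1] & 0xF0) | (value >> 8));
    } else {
        /* Odd sample: low nibble in high half of byte 1, top 8 bits in byte 2. */
        sample_buf[base + 1] =
            (uint8_t)((sample_buf[base + 1] & 0x0F) | ((value & 0x0F) << 4));
        sample_buf[base + 2] = (uint8_t)(value >> 4);
    }
}

/* Read back the 12-bit value stored at sample position idx. */
uint16_t get_sample(uint16_t idx)
{
    uint16_t base = (uint16_t)((idx >> 1) * 3);

    if ((idx & 1) == 0) {
        return (uint16_t)(sample_buf[base] |
                          ((sample_buf[base + 1] & 0x0F) << 8));
    }
    return (uint16_t)((sample_buf[base + 1] >> 4) |
                      (sample_buf[base + 2] << 4));
}
```

With this layout 625 samples occupy 938 bytes, which is in line with the "about 940 bytes" figure. Packing down to 10 bits per sample would need shifts across byte boundaries on every access, which is the extra work described as unaffordable here.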
Once all of the pointers, counters, etc. are taken into account, I use about 1015 bytes of RAM out of the 1024 (this doesn't include the 16 bytes of "shared" RAM - which is also fully utilized.)
Here's the problem:
I can declare the 940 byte audio buffer as local variables just fine and if all I'm doing is the DSP code, I use just 1015 bytes as noted above. (All other variables are made local as much as possible as well.)
When I add the other code for audio tone generation I run out of RAM since the compiler *insists* on reserving scratch RAM for functions that I'm not even using when I'm doing DSP. In other words, the very existence of those functions means that scratch RAM is permanently allocated as "global" variables behind my back.
A specific example of this is that the other functions (e.g. tone generation) use some 32 bit math for frequency calculation (the DSP does not)- but the compiler insists on dedicating scratch RAM to the 32 bit math no matter what, so I end up running out of RAM for the DSP functions. (I could write my own 32 bit math functions that used only "local" variables, but that would be a PITA...)
At the moment I've worked around this by doing a few things, including:
- Using #LOCATE to assign RAM locations to global variables. This is mostly done to speed up the code and minimize bank switching (on the real-time DSP, I only have 2-3 instruction cycles of overhead left between samples), but it also allows me to re-use the RAM for global variables that are needed for the DSP but not for the tone generation, and vice-versa.
- Locating a number of the variables in the registers of unused hardware (e.g. timers 4 and 6, etc.)
Doing the above I've managed to keep about a dozen bytes of free RAM.
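(The re-use idea above can be sketched in portable C with a union, which overlays the DSP-only and tone-only variables on the same storage. This is only an analogy to the #LOCATE approach, which pins variables to fixed addresses instead; the struct and function names below are hypothetical and the real variable set isn't shown in this thread.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state needed only while filtering audio. */
struct dsp_state {
    uint16_t write_index;
    uint16_t read_index;
    uint8_t  lsb_carry;
};

/* Hypothetical state needed only while generating tones. */
struct tone_state {
    uint32_t phase_accum;
    uint32_t phase_step;
};

/* Only one mode runs at a time, so both can live in the same bytes. */
union mode_state {
    struct dsp_state  dsp;
    struct tone_state tone;
};

static union mode_state mode_ram;

/* Each mode re-initialises the shared storage on entry. */
void enter_dsp_mode(void)
{
    mode_ram.dsp.write_index = 0;
    mode_ram.dsp.read_index  = 0;
    mode_ram.dsp.lsb_carry   = 0;
}

void enter_tone_mode(uint32_t step)
{
    mode_ram.tone.phase_accum = 0;
    mode_ram.tone.phase_step  = step;
}

uint32_t tone_step(void)
{
    return mode_ram.tone.phase_step;
}

/* RAM cost with overlay versus two separate sets of globals. */
size_t overlay_bytes(void)
{
    return sizeof mode_ram;
}

size_t separate_bytes(void)
{
    return sizeof(struct dsp_state) + sizeof(struct tone_state);
}
```

The overlay costs only as much as the larger of the two structs. The catch is the same as with hand-placed #LOCATE variables: nothing stops one mode from reading stale data left by the other, so each mode has to initialise its own view on entry.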
Is there a way to convince the compiler to *not* use what effectively are "invisible" global variables (e.g. the scratch RAM) for some of these functions?
Thanks,
CT |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19587
|
|
Posted: Sat Jan 04, 2014 4:28 am |
|
|
The compiler does normally re-use the areas used for this type of scratch. It suggests that the compiler thinks that the possibility does exist that the two lots of code could be called inside one another. This would be down to how it is laid out.
However, there are other things that might lead to this behaviour. For instance, int16 multiplication is used when the compiler is asked to calculate the location of something in an array. Not int32, but you get the idea that sometimes there are 'non obvious' uses of functions....
Also the generic 'scratch' area will be re-used by most compiler functions, and is a reserved area. In your case the area at 077, is reserved for the compiler scratch, which is in the 'all banks' memory area.
However, the big problem is that you seem to be overestimating how much RAM the chip has. The 1024 bytes is the total _including_ the 16 byte shared area. There are 12 banks, each with 80 bytes, one bank with 48 bytes, and the shared area: 1008 bytes, plus the 16 shared bytes. If the 16 shared bytes are 'fully utilised', you do not have the space for the 1015 bytes required.... I don't think any amount of juggling is going to make it fit.
So I think you are 'out of memory', before you start.
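(The arithmetic above, written out as a small portable C sketch. The constants are the figures quoted in this post; the function names are made up for illustration.)

```c
#include <stdint.h>

/* Figures quoted in the post above. */
enum {
    FULL_BANKS          = 12,  /* banks with a full 80 bytes each */
    BYTES_PER_FULL_BANK = 80,
    LAST_BANK_BYTES     = 48,  /* the one partial bank */
    COMMON_BYTES        = 16   /* shared area visible from every bank */
};

/* General-purpose RAM excluding the shared area. */
int16_t banked_ram(void)
{
    return FULL_BANKS * BYTES_PER_FULL_BANK + LAST_BANK_BYTES;
}

/* Total RAM including the shared area. */
int16_t total_ram(void)
{
    return (int16_t)(banked_ram() + COMMON_BYTES);
}

/* Bytes left over (negative means it does not fit) when the shared
   area is already full and 'needed' bytes must go in banked RAM. */
int16_t banked_headroom(int16_t needed)
{
    return (int16_t)(banked_ram() - needed);
}
```

Here `banked_ram()` gives 1008 and `total_ram()` gives 1024, so a 1015 byte requirement on top of a full shared area comes out 7 bytes short.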
Best Wishes |
|
|
asmboy
Joined: 20 Nov 2007 Posts: 2128 Location: albany ny
|
|
Posted: Sat Jan 04, 2014 11:09 am |
|
|
Quote: |
- Tone generation, etc.
- DSP filtering of audio.
|
you could benefit from looking at 18F parts......
just my 2 cents |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19587
|
|
Posted: Sat Jan 04, 2014 12:36 pm |
|
|
Agreed. Especially when you consider that multiplication is typically 8x faster given the hardware multiply instruction..... |
|
|
C Turner
Joined: 10 Nov 2003 Posts: 40 Location: Utah
|
|
Posted: Sat Jan 04, 2014 5:11 pm |
|
|
The math used in the DSP is just shifts and adds, so the 18F part really would not be of much benefit from the standpoint of execution speed. More RAM would be of help, of course, but one of the limitations is the physical footprint: there isn't really anything in the 16/18 family that has more RAM in this size of package - unless something new was released recently...
(The dsPIC family would be the natural choice for that sort of thing...)
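(To illustrate why a hardware multiplier buys little for this kind of math: scaling by a fixed coefficient can be done with shifts and adds alone. The actual filter isn't shown in this thread, so the coefficients and function names below are purely hypothetical, written as portable C.)

```c
#include <stdint.h>

/* Scale a sample by 5/8 (0.625) using only shifts and an add:
   x/2 + x/8 = 5x/8. Each shift truncates, so the result can be
   slightly lower than the exact product. */
uint16_t scale_5_8(uint16_t x)
{
    return (uint16_t)((x >> 1) + (x >> 3));
}

/* Multiply by 10 as 8x + 2x, again with no multiply instruction.
   The caller must keep x small enough that the result fits 16 bits. */
uint16_t times_10(uint16_t x)
{
    return (uint16_t)((x << 3) + (x << 1));
}
```

Fixed shifts and adds like these compile to a handful of instructions without a multiplier, which is why the 18F's hardware multiply would not change much for this workload.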
* * *
As far as allocating scratch variables, it would seem as though the CCS compiler - at least when compiling for the '1847 - does *not* re-use as well as it could: As I mentioned before, if you were to invoke a 32 bit multiply somewhere *else* in the code, the compiler seems to permanently allocate scratch RAM to it, exclusively, effectively making it yet another global variable.
* * *
To be sure, there *is* enough RAM in this processor to do what I need it to do.
What I was asking was if there was some clever way to *prevent* the compiler from doing what amounts to allocating global variables behind my back, but it would appear that the answer is "No." |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19587
|
|
Posted: Sun Jan 05, 2014 2:52 am |
|
|
I still suspect layout.....
Key is to think about how the code is organised.
For instance:
Code: |
void main(void)
{
   while (1)
   {
      routine();
      /* other_codeinmain: statements written inline here, in main() itself */
   }
}
|
will result in all scratch variables in 'other_codeinmain', being reserved when you are in 'routine'.
However:
Code: |
void main(void)
{
   while (1)
   {
      routine();
      other_code();   /* the same statements moved into their own function */
   }
}
|
Doesn't.
I've just put together a dummy program in 5.016, with a load of 32bit maths in one routine, and a second set of fp routines in another, both using large arrays, laid out as in the second example, and with each using more than 80% of the available RAM on your chip, and it merrily compiles. It re-uses the 32bit mul3232.scratch locations for part of the data array in the second routine....
Best Wishes |
|
|
C Turner
Joined: 10 Nov 2003 Posts: 40 Location: Utah
|
|
Posted: Sun Jan 05, 2014 12:05 pm |
|
|
Very good point about the organization of the code - and were it organized that way, that would explain it: Although *I* know that those RAM locations aren't going to conflict if re-used in another main() function, there's no way that the compiler can "know" this.
In looking at the code, it was organized in the way that you suggest, but with further inspection, I think that I found the problem:
Since I'm using an ISR, it looks as though the compiler allocates some scratch RAM to save some variables while in the ISR - in this case, those happen to be associated with the 32 bit math. Interestingly, the code in the ISR "saves" those locations (also using more "global" RAM) even though nothing in the ISR would have caused a conflict with those locations in the first place!
What I ended up doing was writing my own interrupt handler (pretty easy on the '1847 with its shadow registers, and since I have only one interrupt source). This also saved some overhead, so I regained a dozen or so cycles per interrupt for my DSP as well. I don't quite understand why the compiler did what it did in the first place...
Thanks for taking a look!
CT |
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19587
|
|
Posted: Sun Jan 05, 2014 12:27 pm |
|
|
Now you mention an ISR....
Yes, the compiler defaults to saving _everything_ that would cause problems. It is an area where a little thought by the authors would save a lot of hassle. If (for instance) they simply generated a list of everything 'touched' in the interrupt handler(s), then made the save and restore just handle the items in that list, it'd be a huge improvement.
Best Wishes |
|
|