allenhuffman
Joined: 17 Jun 2019 Posts: 554 Location: Des Moines, Iowa, USA
|
CCS - duplicate global variables allowed? |
Posted: Thu Dec 19, 2019 10:21 am |
|
|
We stumbled into a previous issue where duplicate function names did not generate a compiler warning or error. Even using the keyword "static" did not help.
I did a quick test with global variables, and found that duplicates are also allowed with no warning or error.
If there are duplicates, the code will use the one from the earliest C file specified in the Multi Unit Compilation file.
Code: | // main.c
#include <main.h>
//int g_value = 0;
extern int g_value;
int main()
{
printf ("main()\r\n");
printf ("main - g_value = %d\r\n", g_value);
foo ();
bar ();
printf ("main - g_value = %d\r\n", g_value);
while(TRUE)
{
//TODO: User Code
}
} |
Code: | #include <main.h>
int g_value = 100;
void foo()
{
printf ("foo - g_value = %d\r\n", g_value);
} |
Code: | #include <main.h>
int g_value = 200;
void bar()
{
printf ("bar - g_value = %d\r\n", g_value);
} |
foo() and bar() both have a global called g_value, each initialized to different values (100 and 200).
If I add them to the multi-unit compile in the order of main, foo and bar, I get this:
main, foo, bar:
main()
main - g_value = 100
foo - g_value = 100
bar - g_value = 200
main - g_value = 100
main, bar, foo:
main()
main - g_value = 200
foo - g_value = 100
bar - g_value = 200
main - g_value = 200
Fun, eh? _________________ Allen C. Huffman, Sub-Etha Software (est. 1990) http://www.subethasoftware.com
Embedded C, Arduino, MSP430, ESP8266/32, BASIC Stamp and PIC24 programmer.
http://www.whywouldyouwanttodothat.com ? |
|
|
temtronic
Joined: 01 Jul 2010 Posts: 9243 Location: Greensville,Ontario
|
|
Posted: Thu Dec 19, 2019 12:29 pm |
|
|
this...
If there are duplicates, the code will use the one from the earliest C file specified in the Multi Unit Compilation file.
makes sense to me, though I don't know the 'ins and outs' of C or compilers...
I look at it this way...
The compiler assigns or relates the variable 'g_value' to a section of RAM within the PIC memory map, say to RAM location or register 0x0123. All later or subsequent references to 'g_value' will be redirected or assigned to that location 0x0123.
That is no different than having in main()...
g_value=00;
....code
g_value=01;
...more code
g_value=02;
There aren't 3 different variables called g_value, just one where the data is changed, in my case from 00 to 01 to 02.
What's disturbing to me is that apparently in C++ you can have 3 different functions with the same name, where passing the same data (say 0x55) will result in 3 different answers! How it keeps track of what type of data you're passing surely has to increase the 'overhead' and program size? |
|
|
jeremiah
Joined: 20 Jul 2010 Posts: 1354
|
|
Posted: Thu Dec 19, 2019 12:42 pm |
|
|
allenhuffman wrote: |
Fun, eh? |
That's actually somewhat normal. The C language allows multiple variables in different compilation units to have the same name as long as their symbols don't overlap. The ANSI language spec doesn't define HOW to do that and leaves it up to the implementation. ANSI C doesn't define anything past what it calls the "abstract machine", and this is an implementation detail below that level.
In GCC, you do that by declaring g_value as static in either foo's or bar's file and leaving the other non-static. In that case, the extern will follow whichever g_value is NOT static (and thus has a public symbol).
CCS doesn't use the same type of linker (if you can even call it a true linker), so it is free to implement that how it likes. In its case, it chooses to pick the first one that it finds, and it is allowed to do so. That said, it could still be a bug if they didn't intend for their implementation to act that way. I would recommend an email to them asking about it.
Note that even GCC uses ordered linking for some things. I've been able to supply certain .o files to it that already exist and if mine are first in the linking order, it takes those. I'm assuming the supplied ones are defined with a linker alias and weak attributes, but don't know.
Last edited by jeremiah on Thu Dec 19, 2019 1:07 pm; edited 1 time in total |
|
|
jeremiah
Joined: 20 Jul 2010 Posts: 1354
|
|
Posted: Thu Dec 19, 2019 12:53 pm |
|
|
temtronic wrote: |
What's disturbing to me is that apparently in C++ you can have 3 different functions with the same name, where passing the same data (say 0x55) will result in 3 different answers! How it keeps track of what type of data you're passing surely has to increase the 'overhead' and program size? |
They generally handle it at the linking stage, so it doesn't take any more code than if you used 3 differently named functions. The spec doesn't specify how to do this (again, an implementation detail, which makes C++ ABIs very interesting to deal with when there is no spec on this). But most compilers do some form of "name mangling" on the symbols they hand to the linker. When they translate the source to object code, in reality it's just addresses (not actual names). But they also generate a "symbol table" which pairs addresses with symbol names. This doesn't take up any code space; it's only used to generate the EXE. In the symbol table, they name each function based on its name and declared parameters, so you might see 3 separate symbols in the table: fooXyzz1, fooZyzz1, fooMyzz1, where Xyzz1, Zyzz1, and Myzz1 each encode a different parameter list (I made those up, but the actual symbols look like that kind of stuff).
The patterns they generate for the names are all deterministic and are not too dissimilar from creating a "hash" of the function name and parameter list. So the separate object files keep track of the physical address of each function, and the symbol table keeps track of the symbol name and which address is associated with it. When they go to generate code for main, it will reference "symbol names" for any external symbols, look them up in the symbol table, and then get the address and replace the call in main to use that instead of the symbol.
Take the following code compiled with GCC 9.1.0:
Code: |
typedef struct{
int value;
} t1;
typedef struct{
int value;
} t2;
typedef struct{
int value;
} t3;
void foo(t1 v, int i){}
void foo(t2 v, int i){}
void foo(t3 v, int i){}
int main(void){
t1 v1;
t2 v2;
t3 v3;
foo(v1,2);
foo(v2,2);
foo(v3,2);
return 0;
}
|
If I dump the symbol table for that program, I can see all 3 foos there:
Code: |
0000000000401510 T _Z3foo2t1i
000000000040151d T _Z3foo2t2i
000000000040152a T _Z3foo2t3i
|
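In the mangling scheme GCC uses (the Itanium C++ ABI), _Z is the prefix that marks a mangled name, 3 is the length of the name foo, 2t1/2t2/2t3 is the length-prefixed name of the first parameter's type, and i is for int. The return type isn't part of the symbol for ordinary functions like these. |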
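To see what those symbols encode, here is a minimal sketch that demangles them at run time. It assumes GCC or Clang, whose C++ runtime provides abi::__cxa_demangle in <cxxabi.h>; the demangle() helper name is made up for illustration, and toolchains with a different ABI (MSVC, for example) mangle names differently.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang runtimes only)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: turn a mangled symbol such as "_Z3foo2t1i"
// back into a readable signature. Returns an empty string on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, the result is allocated with malloc.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return std::string();
    }
    std::string result(text);
    std::free(text);   // caller owns the malloc'd buffer
    return result;
}
```

Passing "_Z3foo2t1i" to this helper should give foo(t1, int), which is the same thing the c++filt command-line tool prints for that symbol.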
|
|
Ttelmah
Joined: 11 Mar 2010 Posts: 19538
|
|
Posted: Thu Dec 19, 2019 1:07 pm |
|
|
CCS does support overloaded functions, and they work. BUT the parameter types have to be different. So use an int8 and an int16, and two different functions of the same name can exist and be used. What it doesn't do is distinguish aliases for the same physical variable type.... |
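As a rough illustration of that distinction, here is a sketch in standard C++ (not CCS, whose exact rules I have not verified). It uses std::int8_t and std::int16_t as stand-ins for the CCS int8 and int16 types; width_of and small_t are made-up names. In standard C++ a typedef is also only another name for an existing type, so it cannot create an extra overload.

```cpp
#include <cstdint>
#include <type_traits>

// Two overloads that differ only in the width of the parameter.
int width_of(std::int8_t)  { return 8; }
int width_of(std::int16_t) { return 16; }

// Hypothetical alias: a typedef names an existing type, it does not
// create a new one, so it selects the int8_t overload.
typedef std::int8_t small_t;
static_assert(std::is_same<small_t, std::int8_t>::value,
              "small_t is an alias, not a distinct type");

// Uncommenting this would be an error: it redefines width_of(std::int8_t).
// int width_of(small_t) { return 0; }
```

Calling width_of with a small_t argument picks the 8-bit version, because the compiler sees the underlying type rather than the alias name.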
|
|
allenhuffman
Joined: 17 Jun 2019 Posts: 554 Location: Des Moines, Iowa, USA
|
|
Posted: Thu Dec 19, 2019 2:03 pm |
|
|
Since we accidentally stumbled upon this, it is very good to be aware of since there are (sadly) a ton of global variables being used in existing code.
And yeah, C++ generates tons of overhead compared to C. I never learned C++ for that reason, since none of the embedded jobs I have ever had would use it due to code bloat. |
|
|
jeremiah
Joined: 20 Jul 2010 Posts: 1354
|
|
Posted: Thu Dec 19, 2019 7:06 pm |
|
|
allenhuffman wrote: | And yeah, C++ generates tons of overhead compared to C. I never learned C++ for that reason, since none of the embedded jobs I have ever had would use it due to code bloat. |
Not necessarily. It can if you use certain features in a way not tailored towards embedded, but all of the embedded C++ applications I have done on my ARM chips have not generated any bloat and are comparable to C programs (I verified with the assembly). Essentially, you want to avoid exceptions and the standard library completely, and if you plan on using templates or lambdas, you need to use some design patterns. All the remaining C++ features can be used with no real bloat or cost. Scott Meyers (a well-known C++ author) has done quite a few books and presentations on the subject.
And now as of C++11/14/17, C++ code is often smaller than the equivalent C due to constexpr and some other features that offload processing to the compiler instead of your code. Heck, I've done code using virtual functions that was smaller than the equivalent in C using case statements.
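A small sketch of what "offloading processing to the compiler" can look like. The baud_divisor helper, the 16x oversampling formula, and the clock and baud values are all assumptions chosen for illustration, not taken from any particular chip or from this thread.

```cpp
#include <cstdint>

// Hypothetical helper: divisor for a UART that oversamples 16x,
// rounded to the nearest integer.
constexpr std::uint32_t baud_divisor(std::uint32_t clock_hz, std::uint32_t baud) {
    return (clock_hz + 8u * baud) / (16u * baud);
}

// Evaluated by the compiler; only the resulting constant is needed at run time.
constexpr std::uint32_t kDivisor = baud_divisor(16000000u, 9600u);

// A wrong result fails the build instead of misbehaving on the device.
static_assert(kDivisor == 104u, "unexpected divisor for 16 MHz at 9600 baud");
```

In C the same value would typically come from a macro or a run-time calculation; here it is an ordinary typed function that the compiler happens to evaluate ahead of time.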
Now if you program C++ on an OS, it can bloat up, especially since applications for OS'es (.exe's for example) tend to put a lot more stuff in the binary than just the actual code, but you generally don't do that for embedded bare metal binaries (and honestly most compilers do give tools to strip out that stuff if you don't want it).
EDIT: here's an example: I created a templated static class that represents the EIC (External Interrupt Controller) on my ATSAML21 board and used it to clear the EIC interrupt flag and increment a global counter variable:
Code: |
// clear the interrupt flag
408: 2204 movs r2, #4
40a: 4b03 ldr r3, [pc, #12] ; (418 <EIC_Handler+0x10>)
40c: 60da str r2, [r3, #12]
// post increment the global
40e: 4a03 ldr r2, [pc, #12] ; (41c <EIC_Handler+0x14>)
410: 6813 ldr r3, [r2, #0]
412: 3301 adds r3, #1
414: 6013 str r3, [r2, #0]
// leave the interrupt
416: 4770 bx lr
// data
418: 40002408 .word 0x40002408 ; Register address
41c: 2000044c .word 0x2000044c ; global variable address
|
I rewrote it in straight C and it was identical. It's more verbose than PIC24 assembly, but the cortex M0+ doesn't have a large instruction set. But both C and C++ generated the same code. A lot of the magic comes from things like constexpr and using static polymorphism (though now that you can write constexpr constructors, standard polymorphism is also pretty versatile).
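For readers who have not seen the pattern, a templated static class for a peripheral might look roughly like the sketch below. This is not the code from the post: the EicRegs layout, the names, and the fake register block are all assumptions so that it can run on a PC. On real hardware the register block would come from the vendor header and sit at a fixed address.

```cpp
#include <cstdint>

// Hypothetical register block; the real layout comes from the chip header.
struct EicRegs {
    volatile std::uint32_t reserved[3];
    volatile std::uint32_t intflag;
};

// Every member is static, so no object is created and no 'this' pointer
// is passed around. The register block is a compile-time template argument.
template <EicRegs& Regs>
struct Eic {
    static void ClearFlag(unsigned line) { Regs.intflag = 1u << line; }
};

EicRegs fake_eic = {};            // stand-in for the memory-mapped peripheral
using TestEic = Eic<fake_eic>;    // one alias per peripheral instance

volatile std::uint32_t g_counter = 0;

// Same two steps as the handler in the listing: write the flag, bump a global.
void EIC_Handler() {
    TestEic::ClearFlag(2);        // stores the value 4 (bit 2)
    g_counter = g_counter + 1;
}
```

Because the address is known at compile time and the functions are trivially inlinable, an optimizing compiler has everything it needs to reduce the handler to a couple of loads and stores.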
I will say it takes more background in C++ to know how to write embedded code, but using C++ doesn't have to mean bloat. |
|
|
allenhuffman
Joined: 17 Jun 2019 Posts: 554 Location: Des Moines, Iowa, USA
|
|
Posted: Fri Dec 20, 2019 8:26 am |
|
|
jeremiah wrote: | Not necessarily. It can if you use certain features in a way not tailored towards embedded, but all of the embedded C++ applications I have done on my ARM chips have not generated any bloat and are comparable to C programs (I verified with the assembly). Essentially, you want to avoid exceptions and the standard library completely, and if you plan on using templates or lambdas, you need to use some design patterns. All the remaining C++ features can be used with no real bloat or cost. Scott Meyers (a well-known C++ author) has done quite a few books and presentations on the subject. |
I'd like to find a book on embedded programming in C++. It's used an awful lot in job postings around here, but "embedded" these days can mean an ARM processor with megs of RAM ;-)
The main thing I'd like to use it for is doing basic object oriented stuff. How bad is the overhead for that? (Does the CCS compiler go that far with C++ stuff, or just allow function overloads?)
Like, it sure would be handy to have the object contain all its private variables, and get to it with:
obj.Open()
obj.Update()
obj.Close()
...instead of having to use static globals in that C file. |
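A minimal sketch of that idea in standard C++, assuming a compiler with class support (nothing here says whether CCS goes that far). The Device class and its members are hypothetical; the point is only that the state which would otherwise be static globals in a C file becomes private data of the object.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical device wrapper with private state.
class Device {
public:
    // Returns false if the device was already open.
    bool Open() {
        if (m_open) return false;
        m_open = true;
        m_updates = 0;
        return true;
    }
    void Update() { if (m_open) ++m_updates; }
    void Close()  { m_open = false; }

    bool IsOpen() const { return m_open; }
    std::uint32_t Updates() const { return m_updates; }

private:
    bool m_open = false;
    std::uint32_t m_updates = 0;
};

// No virtual functions, so the object carries no vtable pointer:
// it is just the two data members.
static_assert(!std::is_polymorphic<Device>::value, "no vtable expected");
```

Non-virtual member functions like these compile to ordinary functions that receive a hidden pointer to the object, so the cost is about the same as a C function taking a pointer to a struct.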
|
|
jeremiah
Joined: 20 Jul 2010 Posts: 1354
|
|
Posted: Fri Dec 20, 2019 3:42 pm |
|
|
allenhuffman wrote: | jeremiah wrote: | Not necessarily. It can if you use certain features in a way not tailored towards embedded, but all of the embedded C++ applications I have done on my ARM chips have not generated any bloat and are comparable to C programs (I verified with the assembly). Essentially, you want to avoid exceptions and the standard library completely, and if you plan on using templates or lambdas, you need to use some design patterns. All the remaining C++ features can be used with no real bloat or cost. Scott Meyers (a well-known C++ author) has done quite a few books and presentations on the subject. |
I'd like to find a book on embedded programming in C++. It's used an awful lot in job postings around here, but "embedded" these days can mean an ARM processor with megs of RAM ;-)
The main thing I'd like to use it for is doing basic object oriented stuff. How bad is the overhead for that? (Does the CCS compiler go that far with C++ stuff, or just allow function overloads?)
Like, it sure would be handy to have the object contain all its private variables, and get to it with:
obj.Open()
obj.Update()
obj.Close()
...instead of having to use static globals in that C file. |
Scott Meyers retired, so he no longer does the class I was talking about, but he does sell his presentation materials from the class:
https://www.artima.com/shop/effective_cpp_in_an_embedded_environment
And for abstractions where you want the organization of classes but don't need actual objects (like peripherals for example), also consider static polymorphism:
https://www.youtube.com/watch?v=k8sRQMx2qUw
Note that you can still use objects coupled with constexpr and good optimizing C++ compilers. These are just options.
Side note, my arm processor that I program both C++ and Ada on has about 32k RAM. I only tend to use 2-8k of that for most programs.
Just for kicks, here is some mixed C and C++ compiled on GCC 9.2
Code: |
class Test{
public:
constexpr Test();
constexpr void Set(int a);
constexpr int Get() const;
private:
int m_value;
};
constexpr Test::Test() : m_value(0) {}
constexpr void Test::Set(int a){m_value = a;}
constexpr int Test::Get() const {return m_value;}
extern "C" void Set(int a);
extern "C" int Get();
static int s_value = 0;
void Set(int a){ s_value = a; }
int Get() {return s_value;}
void something(void) {
Test v1;
v1.Set(32);
Set(v1.Get());
}
|
You can see the use of the class for variable v1, calling its setter and then its getter. Using the -O2 optimization flag, the assembly generated for the something() function is merely:
Code: |
something():
mov DWORD PTR s_value[rip], 32
ret
|
All of that C++ code got replaced with just "32". I don't know how long godbolt stores links, but here is a link to it compiled on godbolt
https://godbolt.org/z/FyNwmt |
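If you would rather not read assembly to confirm this, a static_assert can prove that the compiler is able to evaluate a set-then-get sequence on its own. The sketch below assumes C++14 or later (needed for a constexpr setter) and uses a made-up Counter class and set_then_get helper with the same shape as the Test class.

```cpp
// Hypothetical class with a constexpr setter and getter.
class Counter {
public:
    constexpr Counter() : m_value(0) {}
    constexpr void Set(int a) { m_value = a; }
    constexpr int Get() const { return m_value; }
private:
    int m_value;
};

// Creates an object, sets it, and reads it back, all usable at compile time.
constexpr int set_then_get(int a) {
    Counter c;
    c.Set(a);
    return c.Get();
}

// Checked during compilation; no run-time code is needed for this line.
static_assert(set_then_get(32) == 32, "setter/getter pair should fold to a constant");
```

If the static_assert compiles, the whole object lifetime was evaluated by the compiler, which is the same folding the -O2 output shows.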