Accessing dynamic memory by array notation

halfro
Posts: 18
Joined: Sat Jul 15, 2017 11:13 am

Accessing dynamic memory by array notation

Postby halfro » Wed Aug 09, 2017 10:12 pm

I have a variable that is dynamically allocated at runtime and is meant to store UTF-8 strings. The UTF-8 strings are converted from a byte sequence received via BLE from a user. The variable in this example is named foo:

Code: Select all

uint32_t foo=(uint32_t*)malloc(sizeof(uint32_t)*utf8string_length);


Once the variable holds data converted from the byte sequence, I try to access the UTF-8 characters in an example loop. I can access them by:

Code: Select all

for(int k=0; k<stringlength; k++)
{
	foo[k*sizeof(foo)];
}


But I can't access it simply by:

Code: Select all

for(int k=0; k<stringlength; k++)
{
	foo[k];
}


Is this a bug in xtensa-gcc, or is this implementation-specific behaviour of xtensa-gcc? Or is my approach simply wrong?

Thanks.

kolban
Posts: 1683
Joined: Mon Nov 16, 2015 4:43 pm
Location: Texas, USA

Re: Accessing dynamic memory by array notation

Postby kolban » Wed Aug 09, 2017 11:47 pm

I'm not seeing where the "uint32_t" is coming into play. A uint32_t is an "unsigned 32 bit integer". How does that relate to a character string?

Normally I associate a character string with "a pointer to the start of the string", which is normally a "char *".

So, imagining I want to create a string of length utf8string_length, I would code:

Code: Select all

char *foo = (char*)malloc(utf8string_length);
Then I would populate the string buffer.

To then walk through each character in the buffer, I would code:

Code: Select all

for (int k=0; k<stringlength; k++) {
   char currentCharacter = foo[k];
}
Free book on ESP32 available here: https://leanpub.com/kolban-ESP32

halfro
Posts: 18
Joined: Sat Jul 15, 2017 11:13 am

Re: Accessing dynamic memory by array notation

Postby halfro » Thu Aug 10, 2017 12:17 pm

Thanks for the reply @Kolban.

Sorry, this was meant to be a pointer to an unsigned 32-bit integer, not an actual unsigned 32-bit integer:

Code: Select all

uint32_t* foo=(uint32_t*)malloc(sizeof(uint32_t)*utf8string_length);
The pointer points to the start of a sequence of decoded characters in heap memory. A Unicode code point can take a value from U+0000 to U+10FFFF. Reference: http://utf8-chartable.de/

This is beyond the range of a 16-bit unsigned integer, so a code point needs to be stored in a 32-bit unsigned integer.

As an example use case: the user enters the copyright symbol ©. This is sent via BLE and shows up as the UTF-8 encoded byte sequence 0xC2 0xA9. A UTF-8 decoder running on the ESP32 decodes that sequence to the Unicode code point U+00A9, which is then displayed on a display device.

In your example:

Code: Select all

for (int k=0; k<stringlength; k++)
{
   char currentCharacter = foo[k];
}
Your sample code will work quite well with a sequence of elements one byte wide, such as uint8_t, int8_t or char, but will not work quite as well with element sizes greater than a byte, such as uint16_t or uint32_t.

Here is a sample code one can try:

Code: Select all

#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "stdio.h"

void app_main(void)
{
    int length=4;
    uint16_t *foo = (uint16_t*)malloc(sizeof(uint16_t)*length);
    uint8_t *bar = (uint8_t*)malloc(length);
    
    /* fill in buffers with deterministic data for testing*/
    for(int i=0; i<length; i++)
    {
        *(foo + (sizeof(*foo) * i) ) = i;
        *(bar + (sizeof(*bar) * i) ) = i;
    }

    printf("TEST 1 START: The element values of both buffers should match\n");

    for(int j=0; j<length; j++)
    {
       printf("Character %d in uint16_t buffer expected: %d, got: %d\n", j, j, foo[j]);
       printf("Character %d in uint8_t buffer expected: %d, got: %d\n", j, j, bar[j]);
    }

    printf("TEST 2 START: The element values of both buffers should match\n");

    for(int k=0; k<length; k++)
    {
       printf("Character %d in uint16_t buffer expected: %d, got: %d\n", k, k, foo[sizeof(*foo)*k]);
       printf("Character %d in uint8_t buffer expected: %d, got: %d\n", k, k, bar[sizeof(*bar)*k]);
    }
    while (1)
    {
         ;   
    }
}
The result:

Code: Select all

TEST 1 START: The element values of both buffers should match
Character 0 in uint16_t buffer expected: 0, got: 0
Character 0 in uint8_t buffer expected: 0, got: 0
Character 1 in uint16_t buffer expected: 1, got: 16378
Character 1 in uint8_t buffer expected: 1, got: 1
Character 2 in uint16_t buffer expected: 2, got: 1
Character 2 in uint8_t buffer expected: 2, got: 2
Character 3 in uint16_t buffer expected: 3, got: 0
Character 3 in uint8_t buffer expected: 3, got: 3
TEST 2 START: The element values of both buffers should match
Character 0 in uint16_t buffer expected: 0, got: 0
Character 0 in uint8_t buffer expected: 0, got: 0
Character 1 in uint16_t buffer expected: 1, got: 1
Character 1 in uint8_t buffer expected: 1, got: 1
Character 2 in uint16_t buffer expected: 2, got: 2
Character 2 in uint8_t buffer expected: 2, got: 2
Character 3 in uint16_t buffer expected: 3, got: 3
Character 3 in uint8_t buffer expected: 3, got: 3
You can find the main.c file attached and replace it in the esp-idf-template/main/ folder to test.

Therefore one can see that for dynamically allocated arrays of byte width element one can access by the notation:
bar[j] as well as bar[sizeof(*bar)*k] from the sample code.

For dynamically allocated arrays, with element sizes greater than a byte, one can only access by:
foo[sizeof(*foo)*k] from the sample code.

My question is: why can't I access an element of, for example, my dynamically allocated uint16_t array by foo[k], where k is the running loop variable? Is this specific to xtensa-gcc?

Thank you.
Attachments
main.c
(1.13 KiB) Downloaded 853 times

kolban
Posts: 1683
Joined: Mon Nov 16, 2015 4:43 pm
Location: Texas, USA

Re: Accessing dynamic memory by array notation

Postby kolban » Thu Aug 10, 2017 3:00 pm

Just by reading (I didn't test code), I think your logic error is here:

Code: Select all

    uint16_t *foo = (uint16_t*)malloc(sizeof(uint16_t)*length);
    uint8_t *bar = (uint8_t*)malloc(length);
    
   for(int i=0; i<length; i++)
    {
        *(foo + (sizeof(*foo) * i) ) = i;
        *(bar + (sizeof(*bar) * i) ) = i;
    }
Imagine that you have variables defined as follows:

Code: Select all

uint16_t *wordPtr = (uint16_t *)0x1000;
uint8_t *bytePtr = (uint8_t *)0x1000;

wordPtr = wordPtr + 1;
bytePtr = bytePtr + 1;
What do you expect the values of wordPtr and bytePtr to be after this code?

The answer is:

Code: Select all

wordPtr == 0x1002;
bytePtr == 0x1001;
When you perform math on pointers (addition, subtraction, indexing), the units are the size of the type the pointer points to, not bytes.

so:

Code: Select all

 *(wordPtr+ (sizeof(*wordPtr) * i) ) = i;
becomes

Code: Select all

  *(wordPtr + (2 * i)) = i;
What I think you want to code is:

Code: Select all

 *(wordPtr+i) = i;

martinayotte
Posts: 141
Joined: Fri Nov 13, 2015 4:27 pm

Re: Accessing dynamic memory by array notation

Postby martinayotte » Thu Aug 10, 2017 3:27 pm

@halfro, I think you should read more about the notion of pointers.

Code: Select all

For dynamically allocated arrays, with element sizes greater than a byte, one can only access by:
foo[sizeof(*foo)*k] from the sample code.
In your example, let's say k=2: you would access foo[4*2], i.e. foo[8]. That is the ninth uint32_t, starting at byte offset 32, which is far from what you wish ... and soon you will get an out-of-bounds access!

halfro
Posts: 18
Joined: Sat Jul 15, 2017 11:13 am

Re: Accessing dynamic memory by array notation

Postby halfro » Thu Aug 10, 2017 4:45 pm

Thank you everyone. Your insight helped. I believe I was writing out of bounds, and therefore the test code was also reading out of bounds.

Changing the code to:

Code: Select all

#include <stdint.h>   /* uint16_t, uint8_t */
#include <stdio.h>    /* printf */
#include <stdlib.h>   /* malloc */
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

void app_main(void)
{
    int length=4;
    uint16_t *foo = (uint16_t*)malloc(sizeof(uint16_t)*length);
    uint8_t *bar = (uint8_t*)malloc(length);
    
    /* fill in buffers with deterministic data for testing*/
    for(int i=0; i<length; i++)
    {
        *(foo + i ) = i;
        *(bar + i ) = i;
    }

    printf("TEST 1 START: The element values of both buffers should match\n");

    for(int j=0; j<length; j++)
    {
       printf("Character %d in uint16_t buffer expected: %d, got: %d\n", j, j, foo[j]);
       printf("Character %d in uint8_t buffer expected: %d, got: %d\n", j, j, bar[j]);
    }

    printf("TEST 2 START: The element values of both buffers should match\n");

    for(int k=0; k<length; k++)
    {
       printf("Character %d in uint16_t buffer expected: %d, got: %d\n", k, k,  foo[sizeof(*foo)*k]);
       printf("Character %d in uint8_t buffer expected: %d, got: %d\n", k, k, bar[sizeof(*bar)*k]);
    }
    while (1)
    {
         ;   
    }
}
This made it pass test 1, as shown below. Test 2 thus reads outside the bounds of the uint16_t array if k>=2:

Code: Select all

TEST 1 START: The element values of both buffers should match
Character 0 in uint16_t buffer expected: 0, got: 0
Character 0 in uint8_t buffer expected: 0, got: 0
Character 1 in uint16_t buffer expected: 1, got: 1
Character 1 in uint8_t buffer expected: 1, got: 1
Character 2 in uint16_t buffer expected: 2, got: 2
Character 2 in uint8_t buffer expected: 2, got: 2
Character 3 in uint16_t buffer expected: 3, got: 3
Character 3 in uint8_t buffer expected: 3, got: 3
TEST 2 START: The element values of both buffers should match
Character 0 in uint16_t buffer expected: 0, got: 0
Character 0 in uint8_t buffer expected: 0, got: 0
Character 1 in uint16_t buffer expected: 1, got: 2
Character 1 in uint8_t buffer expected: 1, got: 1
Character 2 in uint16_t buffer expected: 2, got: 58288
Character 2 in uint8_t buffer expected: 2, got: 2
Character 3 in uint16_t buffer expected: 3, got: 8
Character 3 in uint8_t buffer expected: 3, got: 3
Thank you for the help.
Regards.
