Re: Integer division performance ????
Posted: Tue Dec 10, 2019 11:24 pm
by Baldhead
Hi ESP_Angus,
For now, I think it's good the way it is.
From 4.1 us to 3 us (-Og compiler optimization).
From 4.1 us to 1.9 us (-Os compiler optimization).
Thanks for the help.
Re: Integer division performance ????
Posted: Tue Dec 10, 2019 11:46 pm
by Baldhead
Hi ESP_Angus,
Below is what I'm doing.
Called only at startup.
Code:
static DMA_ATTR uint32_t buf_a[ 15360 ];
static DMA_ATTR lldesc_t dma_desc_buf_a[ 16 ];

static void init_dma_descriptors_a( void )
{
    for ( uint32_t i = 0 ; i < descriptor_size ; i++ ) // descriptor_size = 16
    {
        dma_desc_buf_a[i].size = 4092;
        dma_desc_buf_a[i].length = 0;
        dma_desc_buf_a[i].offset = 0;
        dma_desc_buf_a[i].sosf = 0;
        dma_desc_buf_a[i].eof = 1; // marks this node as the last one in the linked list.
        dma_desc_buf_a[i].owner = 1; // the allowed operator is the DMA controller.
        dma_desc_buf_a[i].buf = (uint8_t*) ( ( &buf_a[0] ) + ( 1023 * i ) ); // 1023 words = 4092 bytes per descriptor.

        if ( i == descriptor_size - 1 )
        {
            dma_desc_buf_a[i].qe.stqe_next = ( lldesc_t* ) NULL; // last descriptor ends the chain.
        }
        else
        {
            dma_desc_buf_a[i].qe.stqe_next = ( lldesc_t* ) &dma_desc_buf_a[i+1];
        }
    }
}
In the per-transfer function below, I eliminated this instruction for the last transfer node:
dma_desc_buf_a[ i ].qe.stqe_next = ( lldesc_t* ) NULL;
I am only using:
dma_desc_buf_a[ i ].eof = 1;
Called every time data is sent through DMA.
Code:
// len is a count of 32-bit words, not bytes: buf_a holds 15360 words.
// When len = 15360 the function takes 3 us (-Og compiler optimization) or 1.9 us (-Os compiler optimization).
static inline int fill_dma_descriptor_a ( uint32_t len )
{
    uint32_t length;

    if ( len > 15360 ) return -1;
    if ( len == 0 ) return -2;

    length = 4 * len; // ( 4 * len ) converts from 32-bit words to bytes.

    if ( length <= 4092 ) // Only a single descriptor is needed.
    {
        dma_desc_buf_a[0].length = length;
        dma_desc_buf_a[0].eof = 1; // marks this node as the last one in the linked list.
        return 1;
    }

    // length > 4092: more than a single descriptor is needed.
    uint32_t fullBufferNum = length / 4092;
    uint32_t remainderBufferNum = length % 4092;

    for ( uint32_t i = 0 ; i < fullBufferNum ; i++ )
    {
        dma_desc_buf_a[i].length = 4092;
        dma_desc_buf_a[i].eof = 0; // this node is not the last one in the linked list.
    }

    if ( remainderBufferNum == 0 ) // The division has no remainder, so no extra descriptor is needed.
    {
        dma_desc_buf_a[ fullBufferNum - 1 ].eof = 1; // marks this node as the last one in the linked list.
        return 2;
    }
    else // One more (statically allocated) descriptor is needed for the remainder.
    {
        dma_desc_buf_a[fullBufferNum].length = remainderBufferNum;
        dma_desc_buf_a[fullBufferNum].eof = 1; // marks this node as the last one in the linked list.
        return 3;
    }
}
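As a quick cross-check of the split arithmetic in the function above, here is a minimal host-side sketch. It assumes that len counts 32-bit words (so 4 * len is the byte count) and that each descriptor carries at most 4092 bytes, as in the code above. The names split_t, split_transfer, DESC_MAX_BYTES and BUF_WORDS are hypothetical and exist only for this illustration; nothing here touches the DMA hardware.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_MAX_BYTES 4092u   /* per-descriptor payload used in the thread */
#define BUF_WORDS      15360u  /* size of buf_a in 32-bit words */

/* Hypothetical helper type: how a transfer splits across descriptors. */
typedef struct {
    uint32_t full;       /* descriptors carrying exactly DESC_MAX_BYTES */
    uint32_t remainder;  /* bytes in the final partial descriptor, 0 if none */
    uint32_t used;       /* total number of descriptors needed */
} split_t;

/* Split a transfer of `words` 32-bit words into descriptor-sized chunks. */
static split_t split_transfer(uint32_t words)
{
    split_t s;
    uint32_t bytes;

    assert(words > 0u && words <= BUF_WORDS);   /* same limits as the thread's checks */

    bytes = 4u * words;                         /* words -> bytes */
    s.full = bytes / DESC_MAX_BYTES;            /* division by a compile-time constant */
    s.remainder = bytes % DESC_MAX_BYTES;
    s.used = s.full + (s.remainder != 0u ? 1u : 0u);
    return s;
}
```

For the largest transfer, split_transfer(15360) gives 15 full descriptors plus a 60-byte remainder, so all 16 descriptors are used and the last one only has 60 bytes of buf_a behind it.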
This configuration is working in my tests.
Do you think it can cause any kind of problem?
My next step will be to implement interrupts.
Can that cause any problems?
Thanks for your help.
Re: Integer division performance ????
Posted: Wed Dec 11, 2019 4:03 am
by Angus
"This configuration is working in my tests. Do you think it can cause any kind of problem?"
Looks fine to me. Of course, I can't debug your driver for you, so something may still need changing here.
"My next step will be to implement interrupts. Can that cause any problems?"
You're asking me if code you haven't written yet might have a problem?
Re: Integer division performance ????
Posted: Wed Dec 11, 2019 5:45 pm
by Baldhead
Hi ESP_Angus,
"You're asking me if code you haven't written yet might have a problem?"
I would like to know whether I need both "instructions" at the same time on the last node of the linked list:
dma_desc_buf_a[ last node ].eof = 1;
dma_desc_buf_a[ last node ].qe.stqe_next = ( lldesc_t* ) NULL;
I am only using:
dma_desc_buf_a[ last node ].eof = 1;
In the first stage of my driver development, I was initializing every field of my linked list, i.e. all lldesc_t fields of each node, which took a long time to fill.
So I optimized my code, and I wonder whether this way is OK.
The last stage of my driver development is to implement interrupts and 2-buffer sync plus tearing sync, which I think will give me a lot of headaches.
Thanks for your help.
Re: Integer division performance ????
Posted: Thu Dec 12, 2019 3:46 am
by Angus
Hi Baldhead,
"I would like to know whether I need both "instructions" at the same time on the last node of the linked list:
dma_desc_buf_a[ last node ].eof = 1;
dma_desc_buf_a[ last node ].qe.stqe_next = ( lldesc_t* ) NULL;"
Right, sorry I missed that. I checked with the peripheral teams: both fields need to be set. EOF=1 causes an EOF interrupt to be triggered when the descriptor is reached, but the DMA operation will continue until it reaches a descriptor whose next descriptor pointer is NULL.
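To illustrate what setting both fields could look like when the chain is linked once at startup and reused for transfers of different lengths, here is a minimal host-side sketch. It is not the driver from this thread: desc_t is a simplified stand-in for lldesc_t (its next member plays the role of qe.stqe_next), and chain_init, chain_fill and chain_total_bytes are hypothetical names. The one idea it shows is that the last used node gets both eof = 1 and a NULL next pointer, and that a link cut by an earlier, shorter transfer is restored before the next one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_COUNT     16u
#define DESC_MAX_BYTES 4092u
#define BUF_BYTES      61440u  /* 15360 words * 4 bytes, as in the thread */

/* Simplified stand-in for lldesc_t (assumption: only these fields matter here). */
typedef struct desc {
    uint32_t length;     /* valid bytes in this descriptor's buffer */
    uint32_t eof;        /* 1 on the last descriptor of a transfer */
    struct desc *next;   /* plays the role of qe.stqe_next */
} desc_t;

static desc_t chain[DESC_COUNT];

/* Link every descriptor once at startup; the final one ends the chain. */
static void chain_init(void)
{
    for (uint32_t i = 0; i < DESC_COUNT; i++) {
        chain[i].length = 0u;
        chain[i].eof = 0u;
        chain[i].next = (i + 1u < DESC_COUNT) ? &chain[i + 1u] : NULL;
    }
}

/* Prepare a transfer of `bytes` bytes. Returns the number of descriptors
 * used, or 0 if the request is empty or does not fit. */
static uint32_t chain_fill(uint32_t bytes)
{
    if (bytes == 0u || bytes > BUF_BYTES) return 0u;

    uint32_t used = (bytes + DESC_MAX_BYTES - 1u) / DESC_MAX_BYTES; /* ceiling division */

    for (uint32_t i = 0; i < used; i++) {
        chain[i].length = DESC_MAX_BYTES;
        chain[i].eof = 0u;
        /* Re-link, in case an earlier, shorter transfer cut the chain here. */
        chain[i].next = (i + 1u < DESC_COUNT) ? &chain[i + 1u] : NULL;
    }

    desc_t *last = &chain[used - 1u];
    last->length = bytes - (used - 1u) * DESC_MAX_BYTES;
    last->eof = 1u;      /* raises the EOF event when this descriptor is reached */
    last->next = NULL;   /* actually stops the walk through the list */
    return used;
}

/* Follow next pointers the way the hardware would and sum the lengths. */
static uint32_t chain_total_bytes(void)
{
    uint32_t total = 0u;
    for (const desc_t *d = &chain[0]; d != NULL; d = d->next) total += d->length;
    return total;
}
```

After chain_init(), a call such as chain_fill(5000) uses two descriptors (4092 + 908 bytes) and ends the list at the second one; a later chain_fill(10000) restores the link from the second to the third descriptor before ending the list there. On the target, the same assignments would go to the real lldesc_t fields.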
Re: Integer division performance ????
Posted: Thu Dec 12, 2019 6:27 pm
by Baldhead
Hi ESP_Angus,
"Right, sorry I missed that. I checked with the peripheral teams: both fields need to be set. EOF=1 causes an EOF interrupt to be triggered when the descriptor is reached, but the DMA operation will continue until it reaches a descriptor whose next descriptor pointer is NULL."
Strange.
For me it's working with only this "instruction": dma_desc_buf_a[ last node ].eof = 1.
However, I am not currently using interrupts; I am using polling.
I am using polling for testing purposes only.
Thanks.