ALAC Frame Structure Details for Encoded Frames

Posted: Tue Jun 24, 2025 8:54 pm
by otomruk
Hello,

I'm trying to figure out the frame structure of the ALAC-encoded output produced by the ESP32 encoder. I'm mainly looking for details such as the header, any sync word, and how each frame is laid out.
I'm using the example from:
...\managed_components\espressif__esp_audio_codec\test_apps\audio_codec_test\main\audio_encoder_test.c. This is part of the espressif/esp_audio_codec component, version 2.3.0 (latest).

Encoder config is: sample_rate = 16000, channel = 1, bits_per_sample = 16, frame_samples = 4096, fast_mode = false.
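For reference, the raw PCM size and duration of one frame follow directly from this config, which gives a yardstick for judging how long each encoded frame is. A minimal sketch; the struct and function names are hypothetical and are not part of the esp_audio_codec API:

```c
#include <stdint.h>

/* Hypothetical container for the values quoted in the config above. */
typedef struct {
    uint32_t sample_rate;     /* Hz */
    uint8_t  channel;         /* number of channels */
    uint8_t  bits_per_sample; /* PCM bit depth */
    uint32_t frame_samples;   /* samples per channel in one frame */
} pcm_frame_cfg_t;

/* Size in bytes of the raw PCM that goes into one encoded frame. */
static uint32_t pcm_frame_bytes(const pcm_frame_cfg_t *cfg)
{
    return cfg->frame_samples * cfg->channel * (cfg->bits_per_sample / 8u);
}

/* Duration in milliseconds of the audio covered by one frame. */
static uint32_t pcm_frame_duration_ms(const pcm_frame_cfg_t *cfg)
{
    return (uint32_t)((uint64_t)cfg->frame_samples * 1000u / cfg->sample_rate);
}
```

With sample_rate = 16000, channel = 1, bits_per_sample = 16 and frame_samples = 4096, each frame covers 8192 bytes of PCM, or 256 ms of audio.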

I attached a sample of the encoded output in hex. I couldn’t find any documentation on the frame format since the encoder source isn’t open. I’m hoping someone here might help. Each frame always starts with the fixed bytes "00 00 00 00 00 13 08", but the remaining bytes change from frame to frame.
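One way to test a guess about the fixed prefix is to read it bit by bit, MSB first, against the mono element header used by Apple's open-source ALAC reference codec. This is only a sketch under that assumption: the field names and widths below are taken from the reference decoder, and nothing confirms that the closed-source Espressif encoder uses the same layout. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSB-first bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t size_bits;
    size_t pos_bits;
} bit_reader_t;

static uint32_t read_bits(bit_reader_t *br, unsigned count)
{
    uint32_t value = 0;
    assert(count <= 32 && br->pos_bits + count <= br->size_bits);
    while (count--) {
        uint8_t byte = br->data[br->pos_bits >> 3];
        unsigned bit = (byte >> (7u - (br->pos_bits & 7u))) & 1u;
        value = (value << 1) | bit;
        br->pos_bits++;
    }
    return value;
}

/* Assumed layout of a compressed mono element header (55 bits in total). */
typedef struct {
    uint8_t  element_tag;   /* 3 bits, 0 = single channel element */
    uint8_t  instance_tag;  /* 4 bits */
    uint16_t unused;        /* 12 bits */
    uint8_t  partial_frame; /* 1 bit */
    uint8_t  bytes_shifted; /* 2 bits */
    uint8_t  escape;        /* 1 bit, 1 = uncompressed frame */
    uint8_t  mix_bits;      /* 8 bits */
    int8_t   mix_res;       /* 8 bits */
    uint8_t  mode;          /* 4 bits */
    uint8_t  den_shift;     /* 4 bits */
    uint8_t  pb_factor;     /* 3 bits */
    uint8_t  num_coefs;     /* 5 bits, predictor coefficients that follow */
} alac_mono_header_t;

/* Returns 0 on success, -1 if the buffer is too short, -2 if the frame is
 * partial or escaped (layouts this sketch does not handle). */
static int parse_mono_header(const uint8_t *buf, size_t len, alac_mono_header_t *h)
{
    bit_reader_t br = { buf, len * 8u, 0 };
    uint32_t flags, byte;

    if (len < 7) {
        return -1;
    }
    h->element_tag   = (uint8_t)read_bits(&br, 3);
    h->instance_tag  = (uint8_t)read_bits(&br, 4);
    h->unused        = (uint16_t)read_bits(&br, 12);
    flags            = read_bits(&br, 4);
    h->partial_frame = (uint8_t)(flags >> 3);
    h->bytes_shifted = (uint8_t)((flags >> 1) & 0x3u);
    h->escape        = (uint8_t)(flags & 0x1u);
    if (h->partial_frame || h->escape) {
        return -2;
    }
    h->mix_bits  = (uint8_t)read_bits(&br, 8);
    h->mix_res   = (int8_t)read_bits(&br, 8);
    byte         = read_bits(&br, 8);
    h->mode      = (uint8_t)(byte >> 4);
    h->den_shift = (uint8_t)(byte & 0xFu);
    byte         = read_bits(&br, 8);
    h->pb_factor = (uint8_t)(byte >> 5);
    h->num_coefs = (uint8_t)(byte & 0x1Fu);
    return 0;
}
```

Read this way, the bytes 00 00 00 00 00 13 08 would decode to element_tag = 0, all flag bits = 0, mix_bits = 0, mix_res = 0, mode = 0, den_shift = 9, pb_factor = 4 and num_coefs = 4, with the one remaining bit being the top bit of the first 16-bit coefficient. If that interpretation holds, the constant prefix is a run of header fields that happen to be identical for every frame under this config rather than a sync word, and the bytes that vary would be the predictor coefficients and the entropy-coded residual.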

I’d really appreciate any info you can share about how the frames are structured.