How can I get the VAD example working on my ESP32-S3 DevKit-C?

barry2212
Posts: 1
Joined: Wed Feb 19, 2025 3:18 pm

Postby barry2212 » Wed Feb 19, 2025 3:41 pm

Hi, I'm trying to get the speech_recognition/vad example working with an ESP32-S3 DevKit-C and an INMP441 microphone. I've wired everything on a breadboard and modified the example's ESP32-S3 Box I2S configuration to match my setup. The code starts up, but it never detects any voice activity.
The test environment was quiet, so background noise should not have interfered with speech detection.
What could be causing this issue, and how can I make my custom setup work?
Thanks, all.

ESP32S3 output:

I (26) boot: ESP-IDF v5.3.2 2nd stage bootloader
I (27) boot: compile time Feb 19 2025 22:08:58
I (27) boot: Multicore bootloader
I (30) boot: chip revision: v0.2
I (33) boot: efuse block revision: v1.3
I (38) boot.esp32s3: Boot SPI Speed : 80MHz
I (43) boot.esp32s3: SPI Mode       : DIO
I (48) boot.esp32s3: SPI Flash Size : 4MB
I (52) boot: Enabling RNG early entropy source...
I (58) boot: Partition Table:
I (61) boot: ## Label            Usage          Type ST Offset   Length
I (69) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (76) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (83) boot:  2 factory          factory app      00 00 00010000 00100000
I (91) boot: End of partition table
I (95) esp_image: segment 0: paddr=00010020 vaddr=3c040020 size=123d8h ( 74712) map
I (117) esp_image: segment 1: paddr=00022400 vaddr=3fc94400 size=02c4ch ( 11340) load
I (120) esp_image: segment 2: paddr=00025054 vaddr=40374000 size=0afc4h ( 44996) load
I (132) esp_image: segment 3: paddr=00030020 vaddr=42000020 size=32e48h (208456) map
I (170) esp_image: segment 4: paddr=00062e70 vaddr=4037efc4 size=05418h ( 21528) load
I (181) boot: Loaded app from partition at offset 0x10000
I (182) boot: Disabling RNG early entropy source...
I (193) cpu_start: Multicore app
I (202) cpu_start: Pro cpu start user code
I (202) cpu_start: cpu freq: 160000000 Hz
I (203) app_init: Application information:
I (205) app_init: Project name:     example_vad
I (211) app_init: App version:      1
I (215) app_init: Compile time:     Feb 19 2025 22:07:42
I (221) app_init: ELF file SHA256:  4914dda38...
I (226) app_init: ESP-IDF:          v5.3.2
I (231) efuse_init: Min chip rev:     v0.0
I (236) efuse_init: Max chip rev:     v0.99
I (241) efuse_init: Chip rev:         v0.2
I (246) heap_init: Initializing. RAM available for dynamic allocation:
I (253) heap_init: At 3FC97C90 len 00051A80 (326 KiB): RAM
I (259) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (265) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (271) heap_init: At 600FE100 len 00001EE8 (7 KiB): RTCRAM
I (279) spi_flash: detected chip: generic
I (282) spi_flash: flash io: dio
W (286) ADC: legacy driver is deprecated, please migrate to `esp_adc/adc_oneshot.h`
I (294) sleep: Configure to isolate all GPIO pins in sleep state
I (301) sleep: Enable automatic switching of GPIO sleep configuration
I (309) main_task: Started on CPU0
I (319) main_task: Calling app_main()
I (319) EXAMPLE-VAD: [ 1 ] Start codec chip
W (319) i2c_bus_v2: I2C master handle is NULL, will create new one
E (329) i2c.master: I2C transaction unexpected nack detected
E (339) i2c.master: s_i2c_synchronous_transaction(888): I2C transaction failed
I (3209) EXAMPLE-VAD: [ 2 ] Create audio pipeline for recording
I (3219) EXAMPLE-VAD: [2.1] Create i2s stream to read audio data from codec chip
I (3229) EXAMPLE-VAD: [2.2] Create filter to resample audio data
I (3229) EXAMPLE-VAD: [2.3] Create raw to receive data
I (3239) EXAMPLE-VAD: [ 3 ] Register all elements to audio pipeline
I (3239) EXAMPLE-VAD: [ 4 ] Link elements together [codec_chip]-->i2s_stream-->filter-->raw-->[VAD]
I (3249) EXAMPLE-VAD: [ 5 ] Start audio_pipeline
W (3259) AUDIO_THREAD: Make sure selected the `CONFIG_SPIRAM_BOOT_INIT` and `CONFIG_SPIRAM_ALLOW_STACK_EXTERNAL_MEMORY` by `make menuconfig`
I (3269) EXAMPLE-VAD: [ 6 ] Initialize VAD handle

Pin Configuration:

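i2s_config->bck_io_num = GPIO_NUM_12;    // bit clock (INMP441 SCK)
i2s_config->ws_io_num = GPIO_NUM_13;     // word select (INMP441 WS)
i2s_config->data_out_num = GPIO_NUM_15;  // data out; the INMP441 has no data input pin
i2s_config->data_in_num = GPIO_NUM_16;   // data in (INMP441 SD)
i2s_config->mck_io_num = GPIO_NUM_2;     // master clock; the INMP441 has no MCLK pin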
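
Two things in the output above may be worth checking. The I2C NACK at step [ 1 ] is what you would expect when the example tries to start a codec chip over I2C, since the INMP441 is a pure I2S microphone with no I2C control interface. Separately, the INMP441 sends 24-bit samples left-justified in 32-bit slots, while VAD code typically consumes 16-bit PCM. The sketch below is plain host-side C, not taken from the example. The helper names `slot32_to_pcm16`, `slots_to_pcm16` and `mean_abs_level` are hypothetical. It assumes the raw I2S data is read as one signed 32-bit word per sample. It shows one way to narrow the samples and to check whether a buffer holds anything other than silence.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Keep the top 16 bits of one 32-bit I2S slot.
 * Assumption: the sample is left-justified in the slot (MSB first). */
static int16_t slot32_to_pcm16(int32_t raw)
{
    uint16_t top = (uint16_t)((uint32_t)raw >> 16);
    int16_t out;
    memcpy(&out, &top, sizeof out);  /* reinterpret the bits as signed */
    return out;
}

/* Convert n raw 32-bit slots into n 16-bit PCM samples. */
static void slots_to_pcm16(const int32_t *raw, int16_t *pcm, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pcm[i] = slot32_to_pcm16(raw[i]);
    }
}

/* Mean absolute amplitude of a PCM buffer.
 * A value that stays at 0 suggests the microphone data never arrives. */
static uint32_t mean_abs_level(const int16_t *pcm, size_t n)
{
    if (n == 0) {
        return 0;
    }
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t s = pcm[i];
        sum += (uint32_t)(s < 0 ? -s : s);
    }
    return (uint32_t)(sum / n);
}
```

On the device, logging `mean_abs_level()` for each buffer read from the I2S stream while speaking into the microphone would show whether any signal reaches the VAD at all. If the level stays at zero, the wiring, the selected channel (the INMP441 L/R pin) or the slot width are likelier suspects than the VAD itself.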

qingvsyu
Posts: 2
Joined: Sat Feb 21, 2026 2:52 pm

Re: How can I get the VAD example working on my ESP32-S3 DevKit-C?

Postby qingvsyu » Sat Feb 21, 2026 2:56 pm

Hello, has this problem been resolved?
I'm seeing the same issue.

I (29) boot: ESP-IDF v5.5.1-dirty 2nd stage bootloader
I (29) boot: compile time Feb 21 2026 22:21:23
I (29) boot: Multicore bootloader
I (30) boot: chip revision: v0.2
I (33) boot: efuse block revision: v1.3
I (36) boot.esp32s3: Boot SPI Speed : 80MHz
I (40) boot.esp32s3: SPI Mode : DIO
I (44) boot.esp32s3: SPI Flash Size : 16MB
I (48) boot: Enabling RNG early entropy source...
I (52) boot: Partition Table:
I (55) boot: ## Label Usage Type ST Offset Length
I (61) boot: 0 nvs WiFi data 01 02 00009000 00006000
I (68) boot: 1 phy_init RF data 01 01 0000f000 00001000
I (74) boot: 2 factory factory app 00 00 00010000 00200000
I (81) boot: 3 model Unknown data 01 82 00210000 0050c000
I (87) boot: End of partition table
I (91) esp_image: segment 0: paddr=00010020 vaddr=3c080020 size=13744h ( 79684) map
I (112) esp_image: segment 1: paddr=0002376c vaddr=3fc95100 size=052ech ( 21228) load
I (117) esp_image: segment 2: paddr=00028a60 vaddr=40374000 size=075b8h ( 30136) load
I (124) esp_image: segment 3: paddr=00030020 vaddr=42000020 size=7156ch (464236) map
I (206) esp_image: segment 4: paddr=000a1594 vaddr=4037b5b8 size=09ad8h ( 39640) load
I (215) esp_image: segment 5: paddr=000ab074 vaddr=50000000 size=00020h ( 32) load
I (223) boot: Loaded app from partition at offset 0x10000
I (223) boot: Disabling RNG early entropy source...
I (233) octal_psram: vendor id : 0x0d (AP)
I (233) octal_psram: dev id : 0x02 (generation 3)
I (233) octal_psram: density : 0x03 (64 Mbit)
I (235) octal_psram: good-die : 0x01 (Pass)
I (239) octal_psram: Latency : 0x01 (Fixed)
I (244) octal_psram: VCC : 0x01 (3V)
I (248) octal_psram: SRF : 0x01 (Fast Refresh)
I (253) octal_psram: BurstType : 0x01 (Hybrid Wrap)
I (258) octal_psram: BurstLen : 0x01 (32 Byte)
I (262) octal_psram: Readlatency : 0x02 (10 cycles@Fixed)
I (267) octal_psram: DriveStrength: 0x00 (1/1)
I (272) MSPI Timing: PSRAM timing tuning index: 4
I (276) esp_psram: Found 8MB PSRAM device
I (280) esp_psram: Speed: 80MHz
I (283) cpu_start: Multicore app
I (712) esp_psram: SPI SRAM memory test OK
I (721) cpu_start: Pro cpu start user code
I (721) cpu_start: cpu freq: 240000000 Hz
I (721) app_init: Application information:
I (721) app_init: Project name: project-WS2812
I (725) app_init: App version: ca1b607-dirty
I (730) app_init: Compile time: Feb 21 2026 22:21:08
I (735) app_init: ELF file SHA256: 1645cb60a...
I (739) app_init: ESP-IDF: v5.5.1-dirty
I (743) efuse_init: Min chip rev: v0.0
I (747) efuse_init: Max chip rev: v0.99
I (751) efuse_init: Chip rev: v0.2
I (755) heap_init: Initializing. RAM available for dynamic allocation:
I (761) heap_init: At 3FC9ADC8 len 0004E948 (314 KiB): RAM
I (766) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (772) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (777) heap_init: At 600FE000 len 00001FE8 (7 KiB): RTCRAM
I (782) esp_psram: Adding pool of 8192K of PSRAM memory to heap allocator
I (789) spi_flash: detected chip: boya
I (792) spi_flash: flash io: dio
I (795) sleep_gpio: Configure to isolate all GPIO pins in sleep state
I (801) sleep_gpio: Enable automatic switching of GPIO sleep configuration
I (808) main_task: Started on CPU0
I (818) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (818) main_task: Calling app_main()
-----INMP441+++++v5+
Initializing
I (828) MODEL_LOADER: The storage free size is 23936 KB
I (828) MODEL_LOADER: The partition size is 5168 KB
I (838) MODEL_LOADER: Successfully load srmodels
I (838) AFE_CONFIG: Set Noise Suppression Model: nsnet2
I (848) AFE_CONFIG: Set WakeNet Model: wn9s_hilexin
I (848) AFE_CONFIG: Set Second WakeNet Model: wn9s_nihaoxiaozhi
wn9s_hilexinwn9s_hilexinW (858) AFE_CONFIG: The playback reference channel is 0, the AEC is deactivated.
MC Quantized wakenet9s: wakenet9s_tts2h8_嗨乐鑫_3_0.630_0.635, tigger:v4, mode:0, p:0, (Dec 16 2025 14:45:55)
MC Quantized wakenet9s: wakenet9s_tts2h8v2_你好小智_3_0.630_0.635, tigger:v4, mode:0, p:0, (Dec 16 2025 14:45:55)
I (928) AFE: AFE Version: (1MIC_V251128)
I (928) AFE: Input PCM Config: total 1 channels(1 microphone, 0 playback), sample rate:16000
I (938) AFE: AFE Pipeline: [input] -> xiaozhi)| -> |VAD(WebRTC)| -> |WakeNet(wn9s_hilexin,wn9s_nihaoxiaozhi)| -> [output]
--Free heap size: 8316500 bytes
