Abstract: We aim for an open-vocabulary sound event localization and detection (SELD) system that detects and localizes sound events in any category described by prompt texts. An open-vocabulary SELD ...