SPASS dataset: A synthetic polyphonic dataset with spatiotemporal labels of sound sources (Q6723786)

From MaRDI portal





Dataset published at Zenodo repository.

      Statements

      SPASS is a synthetic dataset consisting of 10-second audio segments from 5 acoustic scenes: Park, Square, Street, Waterfront, and Market. Each acoustic scene has 5,000 audio recordings and their corresponding metadata. The audio recordings were created using a 3D acoustic simulation environment (RAVEN, https://www.virtualacoustics.org/RAVEN/). SPASS was made as a training dataset for the FuSA system (https://www.acusticauach.cl/fusa/). It is a polyphonic dataset for Sound Event Detection (SED) tasks. The metadata files include the class of each sound event, its onset and offset in time, its position in space (Cartesian coordinates), and its final position if the source was moving. This research was funded by ANID FONDEF grant number ID20I10333.
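
      The metadata fields listed above (class, onset, offset, position, and final position for moving sources) could be represented roughly as follows. This is a minimal sketch and not the dataset's actual file format: the field names, the class names "car" and "bird", the 1-second frame hop, and the assumption that a moving source travels in a straight line at constant speed are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Position = Tuple[float, float, float]  # Cartesian (x, y, z); units are an assumption


@dataclass
class SoundEvent:
    """One labelled sound event; field names are hypothetical."""
    label: str                           # class of the sound event
    onset: float                         # start time in seconds
    offset: float                        # end time in seconds
    start_pos: Position                  # position in space at onset
    end_pos: Optional[Position] = None   # final position, only for moving sources

    def position_at(self, t: float) -> Position:
        """Position at time t, assuming linear motion between start and end."""
        if self.end_pos is None or self.offset <= self.onset:
            return self.start_pos
        frac = (t - self.onset) / (self.offset - self.onset)
        frac = min(max(frac, 0.0), 1.0)  # clamp to the event's duration
        return tuple(a + frac * (b - a)
                     for a, b in zip(self.start_pos, self.end_pos))


def frame_labels(events: List[SoundEvent], classes: List[str],
                 clip_seconds: float = 10.0,
                 hop_seconds: float = 1.0) -> List[List[int]]:
    """Build a frames x classes binary activity grid for an SED target."""
    n_frames = int(round(clip_seconds / hop_seconds))
    grid = [[0] * len(classes) for _ in range(n_frames)]
    for ev in events:
        col = classes.index(ev.label)
        for i in range(n_frames):
            start, end = i * hop_seconds, (i + 1) * hop_seconds
            # Mark the frame active if the event overlaps it at all.
            if ev.onset < end and ev.offset > start:
                grid[i][col] = 1
    return grid


# Hypothetical example: one moving source and one static source.
events = [
    SoundEvent("car", 1.5, 4.0, (0.0, 0.0, 0.0), (10.0, 0.0, 0.0)),
    SoundEvent("bird", 3.0, 3.5, (1.0, 2.0, 3.0)),
]
classes = ["bird", "car"]
grid = frame_labels(events, classes)
```

      Because the dataset is polyphonic, several columns of the grid can be active in the same frame, as happens here where the two example events overlap in time.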
      26 December 2022
      1.0
