
Small Multi-Object Tracking for Spotting Birds (SMOT4SB) Challenge 2025

Overview

In conjunction with MVA2025, we host the Small Multi-Object Tracking for Spotting Birds (SMOT4SB) Challenge. This challenge builds upon the MVA2023 Small Object Detection for Spotting Birds (SOD4SB) Challenge [1] by extending the task from static image-based detection to dynamic video-based tracking. The primary goal is to detect and track small birds captured by unmanned aerial vehicles (UAVs) across multiple frames.

 

Unlike conventional small object detection tasks, this challenge introduces motion information to enhance detection accuracy. Inspired by the human visual system, particularly areas such as MT/V5 and the dorsal stream, which specialize in motion perception, this challenge explores whether incorporating temporal information can improve the detection performance of small birds that are otherwise difficult to identify in static images.

Additionally, beyond detection alone, this challenge introduces a small multi-object tracking task, where participants must assign consistent IDs to individual birds across frames. This added complexity aims to advance research in Small Object Recognition and improve machine vision techniques for dynamic and cluttered environments.

 

As with the previous challenge, SMOT4SB aligns with MVA’s mission of bridging academia and industry in the field of machine vision and its applications. The challenge is designed not only to drive fundamental research but also to foster practical applications, such as bird strike prevention, ecological monitoring, and autonomous UAV navigation.

Through this challenge, we aim to stimulate research in small object tracking while encouraging the development of practical solutions for real-world challenges in UAV perception and automated surveillance.

Announcements

Task

The SMOT4SB Challenge extends the SOD problem into a multi-object tracking (MOT) task. Participants must track small birds across multiple frames in video sequences captured by UAVs.

Unlike static image-based SOD, this challenge requires:

To successfully address this challenge, participants must develop MOT models capable of overcoming the following difficulties:

Participants are encouraged to explore approaches such as:

The dataset includes annotated bird tracking sequences, allowing participants to develop and evaluate their methods under realistic UAV conditions.

By participating in SMOT4SB, researchers and engineers will contribute to advancing small object tracking methodologies with applications in autonomous UAV navigation, ecological monitoring, and bird strike prevention systems.

Dataset

The dataset for the SMOT4SB Challenge extends the SOD4SB dataset [1] by incorporating tracking IDs into video sequences. The dataset is structured into two main parts: the Pre-training Data and the Tracking Dataset.

Participants are encouraged to utilize the Pre-training Data to build robust object detection models before fine-tuning them on the Tracking Dataset for multi-frame bird tracking.

Evaluation

In this challenge, we focus on evaluating tracking performance with an emphasis on HOTA (Higher Order Tracking Accuracy) [3], which explicitly considers detection, localization, and association.

However, because this challenge deals with small object tracking, traditional IoU-based evaluation methods commonly used in general tracking face significant challenges. IoU metrics tend to be overly sensitive to localization errors when applied to small objects, leading to unreliable evaluations.

To address this issue, we propose a novel metric called SO-HOTA (Small Object HOTA), inspired by HOTA but specifically designed for small object tracking. For detection and localization evaluation, we adopt Dot Distance (DotD) [4], a measure better suited to small objects than conventional IoU-based approaches. Please refer to “Proposed Evaluation Metric: SO-HOTA (Small Object HOTA)” below for details of this evaluation metric.

The final rankings in this challenge will be determined solely based on SO-HOTA.

Additionally, a challenge report will be published after the competition, and the top-ranked participants will be invited as co-authors. This report will include an extensive analysis incorporating not only SO-HOTA but also traditional MOT evaluation metrics (e.g., MOTA, IDF1), as well as computational speed and complexity evaluations.

Proposed Evaluation Metric: SO-HOTA (Small Object HOTA)

SO-HOTA adapts the HOTA [3] framework to evaluate tracking performance specifically for small objects. Instead of relying on IoU for similarity scoring, SO-HOTA uses Dot Distance (DotD) [4], which compares point-like object representations via their centroids. This is particularly effective for small objects, where IoU-based evaluation often underperforms due to its sensitivity to spatial misalignments.

Mathematical Definition

1. Dot Distance (DotD) [4]

DotD converts the Euclidean distance between the centroids of predicted and ground-truth bounding boxes into a similarity score in \( (0, 1] \), normalized by the average object size. For a predicted bounding box \( A \) and a ground-truth bounding box \( B \), DotD is given by:

\[ \text{DotD}(A, B) = \exp\left(-\frac{d(A, B)}{s}\right) \]

Where:

  • \( d(A, B) \): the Euclidean distance between the centroids \( (x_A, y_A) \) and \( (x_B, y_B) \) of \( A \) and \( B \): \[ d(A, B) = \sqrt{(x_A - x_B)^2 + (y_A - y_B)^2} \]
  • \( s \): the average size of all objects in the dataset, where \( M \) is the number of images, \( N_i \) the number of objects in image \( i \), and \( w_{ij} \), \( h_{ij} \) the width and height of object \( j \) in image \( i \): \[ s = \sqrt{\frac{\sum_{i=1}^{M} \sum_{j=1}^{N_i} w_{ij} \cdot h_{ij}}{\sum_{i=1}^{M} N_i}} \]
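The two definitions above can be written in a few lines of NumPy; this is a minimal illustrative sketch (the function names `dotd` and `average_size` are our own, not from the challenge toolkit):

```python
import numpy as np

def dotd(center_a, center_b, s):
    """DotD similarity between a predicted and a ground-truth centroid.

    center_a, center_b: (x, y) centroids; s: dataset-average object size.
    Returns exp(-d/s) in (0, 1]: 1.0 for coincident centroids, decaying with distance.
    """
    d = np.hypot(center_a[0] - center_b[0], center_a[1] - center_b[1])
    return np.exp(-d / s)

def average_size(widths, heights):
    """s = sqrt of the mean of w*h over all annotated objects in the dataset."""
    return np.sqrt(np.mean(np.asarray(widths, dtype=float) * np.asarray(heights, dtype=float)))
```

For example, a prediction whose centroid lies exactly one average object size away from the ground truth scores \( e^{-1} \approx 0.37 \).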
2. Matching Predictions and Ground Truth

A one-to-one matching between ground-truth and predicted points is established using the Hungarian algorithm, maximizing the sum of DotD similarity scores. A match is valid only if \( \text{DotD}(A, B) \geq \alpha \), where \( \alpha \) is a threshold.

3. True Positives, False Positives, and False Negatives

To calculate the performance metrics, we define the following:

  • \( TP \) (True Positives): Valid matches with \( \text{DotD}(A, B) \geq \alpha \).
  • \( FP \) (False Positives): Predicted points not matched to any ground truth.
  • \( FN \) (False Negatives): Ground-truth points not matched to any prediction.

These definitions form the basis for computing detection and association accuracies.
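Steps 2 and 3 can be sketched for a single frame as follows, using SciPy's Hungarian solver to maximize total DotD similarity; this is an illustrative sketch (the function `match_frame` is hypothetical, not part of the official evaluation code):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_frame(pred_centers, gt_centers, s, alpha):
    """One-to-one matching of predictions to ground truth on DotD similarity.

    pred_centers, gt_centers: (P, 2) and (G, 2) centroid arrays.
    Returns (tp_pairs, fp_count, fn_count) at threshold alpha.
    """
    pred = np.asarray(pred_centers, dtype=float).reshape(-1, 2)
    gt = np.asarray(gt_centers, dtype=float).reshape(-1, 2)
    if len(pred) == 0 or len(gt) == 0:
        return [], len(pred), len(gt)
    # Pairwise DotD similarity matrix (P x G).
    d = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    sim = np.exp(-d / s)
    # Hungarian algorithm; negate because linear_sum_assignment minimizes cost.
    rows, cols = linear_sum_assignment(-sim)
    # Keep only matches above the threshold as true positives.
    tp_pairs = [(int(r), int(c)) for r, c in zip(rows, cols) if sim[r, c] >= alpha]
    fp = len(pred) - len(tp_pairs)   # unmatched predictions
    fn = len(gt) - len(tp_pairs)     # unmatched ground truths
    return tp_pairs, fp, fn
```

A prediction far from every ground-truth bird thus counts as a false positive even if the Hungarian step tentatively paired it, because its DotD falls below \( \alpha \).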

4. SO-HOTA Scoring

Using the definitions of \( TP \), \( FP \), and \( FN \) from the previous section, SO-HOTA integrates detection accuracy (\( \text{DetA} \)) and association accuracy (\( \text{AssA} \)) as follows:

\[ \text{DetA}_\alpha = \frac{|TP|}{|TP| + |FN| + |FP|} \]

\[ \text{AssA}_\alpha = \frac{1}{|TP|} \sum_{c \in TP} \frac{|TPA(c)|}{|TPA(c)| + |FNA(c)| + |FPA(c)|} \]

Where:

  • \( TPA(c) \): True positive associations for track \( c \).
  • \( FNA(c) \): False negative associations for track \( c \).
  • \( FPA(c) \): False positive associations for track \( c \).

The final SO-HOTA score for a given threshold \( \alpha \) is then computed as:

\[ \text{SO-HOTA}_\alpha = \sqrt{\text{DetA}_\alpha \cdot \text{AssA}_\alpha} \]
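The three formulas above reduce to simple arithmetic once the counts are known. A minimal sketch (names are illustrative, and the per-TP association counts \( TPA(c) \), \( FNA(c) \), \( FPA(c) \) are assumed to be given):

```python
import math

def det_a(tp, fn, fp):
    """Detection accuracy: |TP| / (|TP| + |FN| + |FP|)."""
    return tp / (tp + fn + fp)

def ass_a(assoc_counts):
    """Association accuracy: mean over TP matches of TPA / (TPA + FNA + FPA).

    assoc_counts: list of (tpa, fna, fpa) tuples, one per TP match c.
    """
    return sum(tpa / (tpa + fna + fpa) for tpa, fna, fpa in assoc_counts) / len(assoc_counts)

def so_hota_alpha(tp, fn, fp, assoc_counts):
    """SO-HOTA at one threshold: geometric mean of DetA and AssA."""
    return math.sqrt(det_a(tp, fn, fp) * ass_a(assoc_counts))
```

For instance, with 8 TPs, 1 FN, 1 FP (DetA = 0.8) and every TP achieving an association ratio of 0.8, the score is \( \sqrt{0.8 \cdot 0.8} = 0.8 \).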
5. Integration over Thresholds

The final SO-HOTA score is obtained by averaging over a range of thresholds \( \alpha \) from 0.05 to 0.95 in increments of 0.05:

\[ \text{SO-HOTA} = \frac{1}{19} \sum_{\alpha \in \{0.05, 0.10, \dots, 0.95\}} \text{SO-HOTA}_\alpha \]
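The averaging step can be sketched as below; `per_alpha_score` stands in for whatever computes \( \text{SO-HOTA}_\alpha \) at a given threshold (the name is illustrative):

```python
def so_hota(per_alpha_score):
    """Average SO-HOTA_alpha over alpha = 0.05, 0.10, ..., 0.95.

    per_alpha_score: callable mapping a threshold alpha to SO-HOTA_alpha.
    """
    alphas = [0.05 * k for k in range(1, 20)]  # 19 thresholds
    return sum(per_alpha_score(a) for a in alphas) / len(alphas)
```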

Key Advantages of SO-HOTA

  • Robust to Localization Errors: DotD focuses on centroids, making it less sensitive to object size variations and boundary alignment issues.
  • Tailored for Small Objects: Unlike IoU, DotD effectively evaluates performance on small objects with minimal spatial extent.
  • Balanced Evaluation: SO-HOTA inherits HOTA's balanced consideration of detection, localization, and association.

Baseline

The baseline code for this challenge is available at: GitHub Repository (TBA)

Important dates

Event                                  Date (23:59 PST)
Site online                            2025.1.21
Dataset and baseline code release      2025.2.4
Public test server open                2025.2.4
Public test server close               2025.4.26
Code submission deadline               2025.5.10
Preliminary private test results       2025.5.31
Paper submission deadline              2025.6.28
Notification                           2025.6.30
Camera-ready deadline                  2025.7.6

Please note that the schedule is subject to change.

Prizes & Awards

This Challenge offers cash prizes and awards, along with free admission to MVA2025 for award recipients. Additionally, among the top-ranked participants, those whose own technical paper describing their proposed method is accepted through the peer-review process will be granted the right to present their work in the special session of this challenge at MVA2025.

Rank       Prize Money    Award
1st        200,000 JPY    Best Solution Award
2nd        150,000 JPY    Runner-Up Solution Award
3rd        100,000 JPY    Honorable Mention Solution Award
4th – 5th  50,000 JPY     -

Participants also have a chance to win the Best Booster Award, presented to the individual who contributes most actively to discussions on the Discord channel. The evaluation criteria are the quality of discussions and the number of positive reactions received. The winner of this award will also receive free admission to MVA2025.

Registration

If you wish to participate in this challenge, please register on Codabench to receive email notifications about the challenge; the notification email also contains an invitation to join the Discord channel.

👉Codabench page  

Discussion

A dedicated Discord channel will be available for discussions among participants.

Submission

Participants must submit their tracking results for the public test dataset in a zip file, with the results stored in JSON format. For the private test phase, participants must submit their trained models along with their test scripts before the deadline via the Google Form (TBA).

After the preliminary private test results are announced, only the top five ranked participants will be invited to co-author the challenge report. This report will summarize the competition results and analysis, including detailed evaluations beyond the ranking metric (SO-HOTA), such as traditional MOT evaluation metrics (MOTA, IDF1, etc.), computation speed, and computational cost. Participation in this challenge report as a co-author is mandatory for all top-ranked participants.

Additionally, each of the top five ranked participants will be granted the opportunity to submit their own technical paper describing their proposed method. These papers must follow the format and submission guidelines of the MVA main conference. Submitted papers will undergo a peer-review process, and accepted papers will be presented orally in the special session of this challenge. Accepted authors may also choose to present their work as a poster.

Further details regarding the submission process will be communicated directly to the winners.

Challenge organizers

Technical Event Chairs


Norimichi Ukita
Toyota Technological Institute


Yuki Kondo
TOYOTA Motor Corporation

Staff


Riku Kanayama
Toyota Technological Institute


Yuki Yoshida
Toyota Technological Institute

Contributor


Takayuki Yamaguchi
Iwate Agricultural Research Center

Adviser


Masatsugu Kidode
Nara Institute of Science and Technology

Citing SMOT4SB Challenge 2025

If you use the dataset, evaluation metrics, or baseline code from the SMOT4SB Challenge 2025 in your research, please cite the following papers accordingly.

The first citation refers to the challenge report summarizing the dataset, evaluation methodology, and results, which will be published after the competition. The second citation is for the baseline code used in this challenge.

@inproceedings{mva2025_smot4sb_challenge,
  title={{MVA2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results}},
  author={Yuki Kondo and Norimichi Ukita and Riku Kanayama and Yuki Yoshida and Takayuki Yamaguchi and [Challenge winners]},
  booktitle={2025 19th International Conference on Machine Vision and Applications (MVA)},
  note={\url{https://www.mva-org.jp/mva2025/challenge}},
  year={2025}}
Note: This paper is scheduled to be published in July 2025, and the title and other details are subject to change.
@misc{baselinecode_mva2025_smot4sb_challenge,
  title={{Baseline code for SMOT4SB by IIM-TTIJ}},
  author={Riku Kanayama and Yuki Yoshida and Yuki Kondo},
  license={MIT},
  url={\url{https://github.com/IIM-TTIJ/MVA2025-SMOT4SB}},
  year={2025}}

References

[1].
Y. Kondo, N. Ukita, T. Yamaguchi, H.-Y. Hou, M.-Y. Shen, C.-C. Hsu, E.-M. Huang, Y.-C. Huang, Y.-C. Xia, C.-Y. Wang, C.-Y. Lee, D. Huo, M. A. Kastner, T. Liu, Y. Kawanishi, T. Hirayama, T. Komamizu, I. Ide, Y. Shinya, X. Liu, G. Liang, and S. Yasui, "MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results," in Proceedings of the 18th International Conference on Machine Vision and Applications (MVA), 2023. Available: https://www.mva-org.jp/mva2023/challenge
[2].
S. Fujii, K. Akita, and N. Ukita, "Distant Bird Detection for Safe Drone Flight and Its Dataset," in Proceedings of the 17th International Conference on Machine Vision and Applications (MVA), 2021.
[3].
J. Luiten, A. Osep, P. Dendorfer, P. Torr, A. Geiger, L. Leal-Taixé, and B. Leibe, "HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking," International Journal of Computer Vision (IJCV), 2021.
[4].
C. Xu, J. Wang, W. Yang, and L. Yu, "Dot Distance for Tiny Object Detection in Aerial Images," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2021.

Contact

Google form
