Recent object detection models require large amounts of annotated data to train on new classes of objects.
Few-shot object detection (FSOD) aims to address this problem by learning novel classes given only a few samples.
While competitive results have been achieved with two-stage FSOD detectors, one-stage FSODs typically underperform compared to them.
We observe that the large performance gap between two-stage and one-stage FSODs is mainly due to the weak discriminability of the latter,
which is explained by a small post-fusion receptive field and a small number of foreground samples in the loss function. To address these limitations,
we propose the Few-shot RetinaNet (FSRN), which consists of: a multi-way support training strategy to augment the number of foreground samples for dense meta-detectors,
an early multi-level feature fusion providing a wide receptive field that covers the whole anchor area, and two augmentation techniques on query
and source images to enhance transferability. Extensive experiments show that the proposed approach addresses these limitations and boosts both discriminability and transferability.
FSRN is almost two times faster than two-stage FSODs while remaining competitive in accuracy, and it outperforms the state-of-the-art of one-stage meta-detectors and also some
two-stage FSODs on the MS-COCO and PASCAL VOC benchmarks.
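As a hypothetical illustration of the early multi-level fusion idea (not the paper's actual implementation), the sketch below resizes feature maps from several pyramid levels to a common resolution and concatenates them channel-wise, so the fused map draws on a receptive field spanning all levels:

```python
import numpy as np

def upsample_nn(fmap, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_levels(fmaps):
    """Concatenate pyramid levels channel-wise at the finest resolution.

    fmaps: list of (C, H_i, W_i) arrays, finest first, where each
    level halves the spatial size of the previous one.
    """
    h = fmaps[0].shape[1]
    resized = [upsample_nn(f, h // f.shape[1]) for f in fmaps]
    return np.concatenate(resized, axis=0)
```

The function names and the nearest-neighbour resizing are illustrative choices; the paper fuses query and support features before the detection head, which this toy concat only gestures at.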
@article{tdtosfsod,
  abbr={WACV},
  url={https://arxiv.org/abs/2210.05783},
  author={Guirguis, K. and Abdelsamad, M. and Eskandar, G. and Hendawy, A. and Kayser, M. and Yang, B. and Beyerer, J.},
  title={Towards Discriminative and Transferable One-Stage Few-Shot Object Detectors},
  publisher={Winter Conference on Applications of Computer Vision (WACV) 2023},
  year={2022},
  copyright={arXiv.org perpetual, non-exclusive license},
  selected={false},
  bibtex_show={true}
}
L3D-IVU
CFA: Constraint-based Finetuning Approach for Generalized Few-Shot Object Detection
Guirguis, K.,
Hendawy, A., Eskandar, G., Abdelsamad, M., Kayser, M., and Beyerer, J.
Few-shot object detection (FSOD) seeks to detect novel categories with limited data by leveraging prior knowledge from abundant base data.
Generalized few-shot object detection (G-FSOD) aims to tackle FSOD without forgetting previously seen base classes and,
thus, accounts for a more realistic scenario, where both base and novel classes are encountered at test time. While current FSOD methods suffer from
catastrophic forgetting, G-FSOD addresses this limitation, yet exhibits a performance drop on the novel task compared to state-of-the-art FSOD.
In this work, we propose a constraint-based finetuning approach (CFA) to alleviate catastrophic forgetting while achieving competitive results
on the novel task without increasing the model capacity. CFA adapts a continual learning method, namely Averaged Gradient Episodic Memory (A-GEM), to G-FSOD.
Specifically, more constraints are imposed on the gradient search strategy, from which a new gradient update rule is derived, allowing for better knowledge exchange
between base and novel classes. To evaluate our method, we conduct extensive experiments on the MS-COCO and PASCAL VOC datasets. Our method outperforms current FSOD and
G-FSOD approaches on the novel task with minor degradation on the base task. Moreover, CFA is orthogonal to FSOD approaches and operates as a plug-and-play module without
increasing the model capacity or inference time.
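The A-GEM mechanism that CFA builds on can be sketched as follows. This shows only the standard A-GEM projection (CFA's additional constraints and its derived update rule are not reproduced here), as a minimal NumPy illustration with flattened gradient vectors:

```python
import numpy as np

def agem_project(g, g_ref):
    """Standard A-GEM gradient projection (not CFA's derived rule).

    g:     gradient of the current (novel-task) loss, flattened.
    g_ref: reference gradient computed on the episodic memory
           (base-task samples), flattened.

    If g conflicts with g_ref (negative dot product), g is projected
    onto the half-space where the memory loss does not increase;
    otherwise g is returned unchanged.
    """
    dot = g @ g_ref
    if dot >= 0:
        return g
    return g - (dot / (g_ref @ g_ref)) * g_ref
```

After projection, the update is orthogonal to the reference gradient in the conflicting case, which is what prevents the base-task loss from increasing to first order.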
@article{cfa,
  abbr={L3D-IVU},
  url={https://arxiv.org/abs/2204.05220},
  author={Guirguis, K. and Hendawy, A. and Eskandar, G. and Abdelsamad, M. and Kayser, M. and Beyerer, J.},
  title={CFA: Constraint-based Finetuning Approach for Generalized Few-Shot Object Detection},
  publisher={Workshop on Learning with Limited Labelled Data for Image and Video Understanding (L3D-IVU)},
  year={2022},
  copyright={arXiv.org perpetual, non-exclusive license},
  selected={true},
  bibtex_show={true}
}
2021
Uni Stuttgart
Material Identification using MIMO Radars in Non-contact Dynamical Environments
Identifying the material type of objects is an asset for a robust autonomous system.
This can be exemplified by an autonomous vehicle that changes its speed given the material of the
ground, a robot vacuum that switches between different operating modes based on the type of the
floor, or a rescue robot that detects the existence of humans under building debris. Computer
Vision (CV) has succeeded in outperforming its peers in many real-life challenges. Although
Convolutional Neural Networks (CNNs) achieve outstanding performance for material classification from images, they are limited to suitable lighting conditions, distinguishable object
textures, and unoccluded objects. Given those limitations, images are no longer the appropriate input for material classification. On the other hand, recent works utilize Intermediate
Frequency (IF) radar signals for material classification to tackle the former limitations.
IF signals are inherently high-variance as a result of many factors, such as oscillation or the relative distance of the object in front of the radar. Different preprocessing techniques have been
used to feed robust, low-variance data to a Neural Network (NN) model. However, a
noticeable delay arises during inference. Therefore, we propose a radar-based material classifier that operates on IF signals yet is robust against their high variance. Moreover, former
works succeed in classifying various material types using IF signals; however, their classification setting requires the object to be in contact with the radar. Non-contact
classification has never been tackled by any former Deep Learning (DL) work. Accordingly,
a new learning setting is proposed to map the high-variance non-contact input domain to a
low-variance input domain, as in the contact case. This approach treats the effect of distance as
noise that can be removed by an appropriate architecture. A modified version of WaveNet
is adopted as our denoising architecture. As a result, a low intra-variance manifold of each
class is formed, which can be easily classified using a shallow NN. Since a public radar-based
dataset for material classification is not available for training and evaluation, we present a
radar-based material classification dataset, collected using a 24 GHz Millimeter-wave (mmWave)
radar, as a well-defined benchmark. Since this is the first work to tackle the material
classification problem in the non-contact case, we lack a former work to compare against.
Thus, we adopt a well-known image classification family, ResNet [1], as our baseline. Our
approach outperforms the classification baseline in terms of test accuracy. In addition, robust
performance is shown in a real-life scenario.
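The denoise-then-classify pipeline described above can be sketched as a toy illustration only: a simple moving-average filter stands in for the thesis's WaveNet-based denoiser, and a nearest-centroid rule stands in for the shallow NN classifier. None of these choices reflect the actual architecture:

```python
import numpy as np

def denoise(signal, k=5):
    """Toy stand-in for the WaveNet denoiser: moving-average filter."""
    kernel = np.ones(k) / k
    return np.convolve(signal, kernel, mode="same")

def classify(signal, centroids):
    """Toy stand-in for the shallow NN: denoise, then pick the
    nearest class centroid in the low-variance manifold."""
    clean = denoise(signal)
    dists = [np.linalg.norm(clean - c) for c in centroids]
    return int(np.argmin(dists))
```

The point of the sketch is the two-stage structure: first map the high-variance input toward a low intra-variance representation, then apply a simple classifier on top.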
@book{rp_thesis,
  abbr={Uni Stuttgart},
  bibtex_show={true},
  title={Material Identification using MIMO Radars in Non-contact Dynamical Environments},
  author={Hendawy, A.},
  year={2021},
  month=apr,
  publisher={University of Stuttgart}
}
2018
TUM
A Hybrid Approach for Constrained Deep Reinforcement Learning
Recently, deep reinforcement learning techniques have achieved tangible results for
learning high-dimensional control tasks. Due to the trial-and-error interaction
between the autonomous agent and the environment, the learning phase is unconstrained
and limited to the simulator. Such exploration has the additional drawback
of consuming unnecessary samples at the beginning of the learning process. Model-based
algorithms, on the other hand, handle this issue by learning the dynamics
of the environment. However, model-free algorithms have a higher asymptotic performance
than model-based ones. Our contribution is to construct a hybrid
algorithm that makes use of the benefits of both methods to satisfy
constraint conditions throughout the learning process. We demonstrate the validity of
our approach by learning a reachability task. The results show complete
satisfaction of the constraint condition, represented by a static obstacle, with fewer
samples and higher performance compared to state-of-the-art model-free
algorithms.
@book{bsc_thesis,
  abbr={TUM},
  bibtex_show={true},
  title={A Hybrid Approach for Constrained Deep Reinforcement Learning},
  author={Hendawy, A.},
  year={2018},
  month=jul,
  publisher={TUM},
  pdf={A_Hybrid_Approach_for_Constrained_Deep_Reinforcement_Learning.pdf}
}