David Cornu

Winning the SKA Science Data Challenge 2 with fast dedicated CNN architectures

Thursday October 7th 2021, 13:30

Conference room of the observatory

David Cornu

LERMA, Observatoire de Paris


With its 1 TB simulated data cube of HI line emission, the SKA Science Data Challenge 2 (SDC2) comes close to the difficulty of analyzing real upcoming SKA observations. Although the tasks to perform in the SKA SDCs are rather classical, modern datasets have become highly demanding for classical approaches because of their size and dimensionality. For this reason, astronomers have started to focus on Machine Learning approaches that have demonstrated their efficiency in similar applications. However, hyperspectral images from interferometers are very different from the images used to train state-of-the-art pattern recognition algorithms (noise level, contrast, object size, class imbalance, spectral dimensionality, etc.). As a direct consequence, these methods do not perform as well as expected when applied directly to astronomical datasets.

In this context, a team from the MINERVA (MachINe lEarning for Radioastronomy at obserVatoire de PAris) project registered for the challenge with the objective of developing innovative Machine Learning methods. In this presentation we will describe our work on implementing the modern YOLO (You Only Look Once) CNN architecture, designed for object detection, inside our custom framework CIANNA (Convolutional Interactive Artificial Neural Networks by/for Astrophysicists), and detail the modifications and tuning that allowed us to take first place in the SKA SDC2 (including catalog merging with a more pedestrian CNN approach). We will discuss the strengths and weaknesses of this type of architecture in comparison to the more widely adopted Region-Based CNNs (Faster R-CNN, Mask R-CNN, ...). We will also review the numerous changes we made to the network (data quantization, 3D convolution, layer architecture, detection layout to manage blending, additional parameter inference, etc.) in order to apply it to both SDC1 and SDC2, and identify its present limits as well as some avenues for further improvement. Finally, we will comment on how this methodology could be used to analyze actual data from SKA pathfinders or any other similar astronomical dataset.