Fabrizio Falchi

     

News


VISIONE 2024 won!

VISIONE, our content-based video retrieval system, won the Video Browser Showdown - The Video Retrieval Competition in Amsterdam, The Netherlands.



VISIONE combines many scientific results that our AIMH Lab of the Institute of Information Science and Technologies of Consiglio Nazionale delle Ricerche achieved and published in the last few years.

It is an impressive achievement for which I would like to thank my colleagues: Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo.



 VISIONE 5.0: Enhanced User Interface and AI Models for VBS2024

G. Amato, P. Bolettieri, F. Carrara, F. Falchi, C. Gennaro, N. Messina, L. Vadicamo, C. Vairo

In Lecture Notes in Computer Science book series, LNCS, volume 14557
MultiMedia Modeling: 30th International Conference, MMM 2024, Amsterdam, The Netherlands, January 29 – February 2, 2024, Proceedings, Part IV
Amsterdam, The Netherlands, January 29 – February 2, 2024
Springer

ISSN: 0302-9743 , ISBN: 978-3-031-53302-0 , Scopus: 2-s2.0-85184824787 , DOI: 10.1007/978-3-031-53302-0_29

In this paper, we introduce the fifth release of VISIONE, an advanced video retrieval system offering diverse search functionalities. The user can search for a target video using textual prompts, drawing objects and colors appearing in the target scenes in a canvas, or images as query examples to search for video keyframes with similar content. Compared to the previous version of our system, which was runner-up at VBS 2023, the forthcoming release, set to participate in VBS 2024, showcases a refined user interface that enhances its usability and updated AI models for more effective video content analysis.
		@InProceedings{10.1007/978-3-031-53302-0_29,
		author="Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio and Messina, Nicola and Vadicamo, Lucia and Vairo, Claudio",
		title="VISIONE 5.0: Enhanced User Interface and AI Models for VBS2024",
		booktitle="MultiMedia Modeling",
		year="2024",
		publisher="Springer Nature Switzerland",
		address="Cham",
		pages="332--339",
		abstract="In this paper, we introduce the fifth release of VISIONE, an advanced video retrieval system offering diverse search functionalities. The user can search for a target video using textual prompts, drawing objects and colors appearing in the target scenes in a canvas, or images as query examples to search for video keyframes with similar content. Compared to the previous version of our system, which was runner-up at VBS 2023, the forthcoming release, set to participate in VBS 2024, showcases a refined user interface that enhances its usability and updated AI models for more effective video content analysis.",
		isbn="978-3-031-53302-0"
		}
		
Best paper award at CBMI 2024
VISIONE

VISIONE, our content-based video retrieval system, obtained second place at the Video Browser Showdown - The Video Retrieval Competition in Bergen, Norway.



VISIONE combines many scientific results that our AIMH Lab of the Institute of Information Science and Technologies of Consiglio Nazionale delle Ricerche achieved and published in the last few years.

It is an impressive achievement for which I would like to thank my colleagues: Giuseppe Amato, Paolo Bolettieri, Fabio Carrara, Claudio Gennaro, Nicola Messina, Lucia Vadicamo, Claudio Vairo.



 VISIONE at Video Browser Showdown 2023

G. Amato, P. Bolettieri, F. Carrara, F. Falchi, C. Gennaro, N. Messina, L. Vadicamo, C. Vairo

In Lecture Notes in Computer Science book series, LNCS, volume 13833
MultiMedia Modeling: 29th International Conference on MultiMedia Modeling, MMM 2023, Bergen, Norway, January 9–12, 2023, Proceedings, Part I
Bergen, Norway. 9-12 January 2023
Springer

ISSN: 0302-9743 , ISBN: 978-3-031-27076-5 , eISBN: 978-3-031-27077-2 , Scopus: 2-s2.0-85152563075 , DOI: 10.1007/978-3-031-27077-2_48

In this paper, we present the fourth release of VISIONE, a tool for fast and effective video search on a large-scale dataset. It includes several search functionalities like text search, object and color-based search, semantic and visual similarity search, and temporal search. VISIONE uses ad-hoc textual encoding for indexing and searching video content, and it exploits a full-text search engine as search backend. In this new version of the system, we introduced some changes both to the current search techniques and to the user interface.
	@InProceedings{10.1007/978-3-031-27077-2_48,
	author="Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio and Messina, Nicola and Vadicamo, Lucia and Vairo, Claudio", editor="Dang-Nguyen, Duc-Tien and Gurrin, Cathal and Larson, Martha and Smeaton, Alan F. and Rudinac, Stevan and Dao, Minh-Son and Trattner, Christoph and Chen, Phoebe", 
	title="VISIONE at Video Browser Showdown 2023",
	booktitle="MultiMedia Modeling",
	year="2023",
	publisher="Springer International Publishing",
	address="Cham",
	pages="615--621",
	abstract="In this paper, we present the fourth release of VISIONE, a tool for fast and effective video search on a large-scale dataset. It includes several search functionalities like text search, object and color-based search, semantic and visual similarity search, and temporal search. VISIONE uses ad-hoc textual encoding for indexing and searching video content, and it exploits a full-text search engine as search backend. In this new version of the system, we introduced some changes both to the current search techniques and to the user interface.",
	isbn="978-3-031-27077-2"
	}
	
Rai per il Sociale: Deep Learning
A chat about Deep Learning with Rai per il Sociale


RaiPlay
La scienza che non c'era - Machine Learning
Fabrizio Falchi talks about Machine Learning at the event
La scienza che non c'era: l'informatica e i prossimi cento anni del CNR
14 November 2023, CNR Auditorium, Pisa



Ital-IA 2023
Fabrizio Falchi is co-chair of Ital-IA 2023, the 3rd National Conference on Artificial Intelligence organized by CINI (Pisa, May 29-31, 2023).



CVPR 2024


Our paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding"
by Lorenzo Bianchi, Fabio Carrara, Nicola Messina, Claudio Gennaro, and Fabrizio Falchi
has been accepted at IEEE/CVF CVPR 2024 - the Computer Vision and Pattern Recognition Conference, the premier annual computer vision event.

About

Fabrizio Falchi is a researcher at the Artificial Intelligence for Media and Humanities (AIMH) lab of ISTI-CNR. He holds a Ph.D. in Information Engineering from the University of Pisa (Italy) and a Ph.D. in Informatics from the Faculty of Informatics of Masaryk University in Brno (Czech Republic). He also received an M.B.A. from Scuola Superiore Sant'Anna in Pisa.

His research interests include deep learning, convolutional neural networks, deep features, similarity search, distributed indexes, multimedia information retrieval, computer vision, and peer-to-peer systems.


Associated with the BioRobotics Institute of Scuola Superiore Sant'Anna

Member of ACM since 2012, Fabrizio participates in SIGMM

Member of the Computer Vision Foundation (CVF)

Member of the Italian Association for Computer Vision, Pattern Recognition and Machine Learning (CVPL)

Member of the CINI Lab on Artificial Intelligence and Intelligent Systems (CINI-AIIS)


Media

Rai per il Sociale
Domande snack.

Intuizione Digitale
The webinars of the Ludoteca del Registro .it.

Coscienza automatica
Artificial intelligence and intuition are no longer strangers: the workings of #deeplearning, one of the methods underlying today's AI algorithms, have much in common with the almost "magical" character of the intuitive dynamics of the human #mind. Is that enough to attribute to machines characteristics such as emotion, imagination, and creativity? Can we say that with "artificial intuition" AI breaks away from the automation paradigm? Or even that this is the first step toward the acquisition of consciousness?

Publications

Selected


 A leap among quantum computing and quantum neural networks: A survey

F.V. Massoli, L. Vadicamo, G. Amato, F. Falchi

In ACM Computing Surveys, vol. 55, issue 5, May 2023, article n. 98, pp. 1-37.

ISSN: 0360-0300 , eISSN: 1557-7341 , WOS: 000892356400012 , Scopus: 2-s2.0-85132292599 , DOI: 10.1145/3529756

In recent years, Quantum Computing witnessed massive improvements in terms of available resources and algorithms development. The ability to harness quantum phenomena to solve computational problems is a long-standing dream that has drawn the scientific community’s interest since the late ’80s. In such a context, we propose our contribution. First, we introduce basic concepts related to quantum computations, and then we explain the core functionalities of technologies that implement the Gate Model and Adiabatic Quantum Computing paradigms. Finally, we gather, compare, and analyze the current state-of-the-art concerning Quantum Perceptrons and Quantum Neural Networks implementations.
@article{10.1145/3529756,
author = {Massoli, Fabio Valerio and Vadicamo, Lucia and Amato, Giuseppe and Falchi, Fabrizio},
title = {A Leap among Quantum Computing and Quantum Neural Networks: A Survey},
year = {2022},
issue_date = {May 2023},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {55},
number = {5},
issn = {0360-0300},
url = {https://doi.org/10.1145/3529756},
doi = {10.1145/3529756},
abstract = {In recent years, Quantum Computing witnessed massive improvements in terms of available resources and algorithms development. The ability to harness quantum phenomena to solve computational problems is a long-standing dream that has drawn the scientific community’s interest since the late ’80s. In such a context, we propose our contribution. First, we introduce basic concepts related to quantum computations, and then we explain the core functionalities of technologies that implement the Gate Model and Adiabatic Quantum Computing paradigms. Finally, we gather, compare, and analyze the current state-of-the-art concerning Quantum Perceptrons and Quantum Neural Networks implementations.},
journal = {ACM Comput. Surv.},
month = {dec},
articleno = {98},
numpages = {37},
keywords = {quantum deep learning, quantum machine learning, quantum neural network, Quantum computing}
}

 Multi-camera vehicle counting using edge-AI

L. Ciampi, C. Gennaro, F. Carrara, F. Falchi, C. Vairo, G. Amato

In Expert Systems with Applications, vol. 207, 30 November 2022, 117929.
Elsevier

ISSN: 0957-4174 , eISSN: 1873-6793 , WOS: 000827527100007 , Scopus: 2-s2.0-85133470139 , DOI: 10.1016/j.eswa.2022.117929

This paper presents a novel solution to automatically count vehicles in a parking lot using images captured by smart cameras. Unlike most of the literature on this task, which focuses on the analysis of single images, this paper proposes the use of multiple visual sources to monitor a wider parking area from different perspectives. The proposed multi-camera system is capable of automatically estimating the number of cars present in the entire parking lot directly on board the edge devices. It comprises an on-device deep learning-based detector that locates and counts the vehicles from the captured images and a decentralized geometric-based approach that can analyze the inter-camera shared areas and merge the data acquired by all the devices. We conducted the experimental evaluation on an extended version of the CNRPark-EXT dataset, a collection of images taken from the parking lot on the campus of the National Research Council (CNR) in Pisa, Italy. We show that our system is robust and takes advantage of the redundant information deriving from the different cameras, improving the overall performance without requiring any extra geometrical information of the monitored scene.
@ARTICLE{Ciampi2022,
	author = {Ciampi, Luca and Gennaro, Claudio and Carrara, Fabio and Falchi, Fabrizio and Vairo, Claudio and Amato, Giuseppe},
	title = {Multi-camera vehicle counting using edge-AI},
	year = {2022},
	journal = {Expert Systems with Applications},
	volume = {207},
	doi = {10.1016/j.eswa.2022.117929},
	publication_stage = {Final},
	source = {Scopus},
	note = {Cited by: 2; All Open Access, Green Open Access}
}
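
The decentralized merging step can be illustrated with a small Python sketch: each camera reports detections projected onto a shared ground plane, and a detection falling within a small radius of one already reported by another camera is treated as a duplicate. The function names, detection schema, and deduplication radius are illustrative assumptions, not the system's actual geometric procedure.

# Toy sketch of merging per-camera counts on a shared ground plane.
# Thresholds, names, and the clustering rule are illustrative assumptions.
from math import dist

def merge_counts(detections_per_camera, dedup_radius=1.5):
    """detections_per_camera: list of lists of (x, y) ground-plane points.
    Returns the global count after removing cross-camera duplicates."""
    merged = []
    for cam_dets in detections_per_camera:
        for p in cam_dets:
            # A detection already reported within dedup_radius metres
            # is treated as the same vehicle seen by another camera.
            if all(dist(p, q) > dedup_radius for q in merged):
                merged.append(p)
    return len(merged)

# Two cameras observing a shared area: the car at ~(5, 5) is seen by both.
cam_a = [(1.0, 2.0), (5.0, 5.0)]
cam_b = [(5.2, 4.9), (9.0, 1.0)]
print(merge_counts([cam_a, cam_b]))  # -> 3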

 An embedded toolset for human activity monitoring in critical environments

M. Di Benedetto, F. Carrara, L. Ciampi, F. Falchi, C. Gennaro, G. Amato

In Expert Systems with Applications, vol. 199, 1 August 2022, 117125.
Elsevier

ISSN: 0957-4174 , eISSN: 1873-6793 , WOS: 000794183900002 , Scopus: 2-s2.0-85128232717 , DOI: 10.1016/j.eswa.2022.117125

In many working and recreational activities, there are scenarios where both individual and collective safety have to be constantly checked and properly signaled, as occurring in dangerous workplaces or during pandemic events like the recent COVID-19 disease. From wearing personal protective equipment to filling physical spaces with an adequate number of people, it is clear that a possibly automatic solution would help to check compliance with the established rules. Based on an off-the-shelf compact and low-cost hardware, we present a deployed real use-case embedded system capable of perceiving people’s behavior and aggregations and supervising the appliance of a set of rules relying on a configurable plug-in framework. Working on indoor and outdoor environments, we show that our implementation of counting people aggregations, measuring their reciprocal physical distances, and checking the proper usage of protective equipment is an effective yet open framework for monitoring human activities in critical conditions.
@article{DIBENEDETTO2022117125,
title = {An embedded toolset for human activity monitoring in critical environments},
journal = {Expert Systems with Applications},
volume = {199},
pages = {117125},
year = {2022},
issn = {0957-4174},
doi = {https://doi.org/10.1016/j.eswa.2022.117125},
url = {https://www.sciencedirect.com/science/article/pii/S0957417422005218},
author = {Marco {Di Benedetto} and Fabio Carrara and Luca Ciampi and Fabrizio Falchi and Claudio Gennaro and Giuseppe Amato},
}
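
The configurable plug-in idea can be sketched as follows: rules are plain callables applied to per-frame detections, so new checks can be added without touching the monitoring loop. The rule names, detection schema, and thresholds below are illustrative assumptions, not the toolset's actual API.

# Minimal sketch of a plug-in rule framework over per-frame detections.
from math import dist

def min_distance_rule(people, threshold=2.0):
    """Flag pairs of people closer than `threshold` metres."""
    violations = []
    for i in range(len(people)):
        for j in range(i + 1, len(people)):
            if dist(people[i]["pos"], people[j]["pos"]) < threshold:
                violations.append((i, j))
    return violations

def ppe_rule(people, required=frozenset({"helmet"})):
    """Flag people missing required protective equipment."""
    return [i for i, p in enumerate(people) if not required <= p["ppe"]]

RULES = [min_distance_rule, ppe_rule]  # plug-ins are plain callables

frame = [
    {"pos": (0.0, 0.0), "ppe": {"helmet"}},
    {"pos": (1.0, 0.5), "ppe": set()},
]
for rule in RULES:
    print(rule.__name__, "->", rule(frame))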

 ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval

N. Messina, M. Stefanini, M. Cornia, L. Baraldi, F. Falchi, G. Amato, R. Cucchiara

International Conference on Content-Based Multimedia Indexing (CBMI), pages 64-70.
Graz, Austria. 14-16 Sept 2022

ISBN: 978-1-4503-9720-9 , Scopus: 2-s2.0-85139922183 , DOI: 10.1145/3549555.3549576

Image-text matching is gaining a leading role among tasks involving the joint understanding of vision and language. In literature, this task is often used as a pre-training objective to forge architectures able to jointly deal with images and texts. Nonetheless, it has a direct downstream application: cross-modal retrieval, which consists in finding images related to a given query text or vice-versa. Solving this task is of critical importance in cross-modal search engines. Many recent methods proposed effective solutions to the image-text matching problem, mostly using recent large vision-language (VL) Transformer networks. However, these models are often computationally expensive, especially at inference time. This prevents their adoption in large-scale cross-modal retrieval scenarios, where results should be provided to the user almost instantaneously. In this paper, we propose to fill in the gap between effectiveness and efficiency by proposing an ALign And DIstill Network (ALADIN). ALADIN first produces high-effective scores by aligning at fine-grained level images and texts. Then, it learns a shared embedding space – where an efficient kNN search can be performed – by distilling the relevance scores obtained from the fine-grained alignments. We obtained remarkable results on MS-COCO, showing that our method can compete with state-of-the-art VL Transformers while being almost 90 times faster. The code for reproducing our results is available at https://github.com/mesnico/ALADIN.
@inproceedings{10.1145/3549555.3549576,
author = {Messina, Nicola and Stefanini, Matteo and Cornia, Marcella and Baraldi, Lorenzo and Falchi, Fabrizio and Amato, Giuseppe and Cucchiara, Rita},
title = {ALADIN: Distilling Fine-Grained Alignment Scores for Efficient Image-Text Matching and Retrieval},
year = {2022},
isbn = {9781450397209},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3549555.3549576},
doi = {10.1145/3549555.3549576},
booktitle = {Proceedings of the 19th International Conference on Content-Based Multimedia Indexing},
pages = {64--70},
series = {CBMI '22}
}

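A minimal PyTorch sketch of the align-and-distill step: a shared embedding space is trained so that cheap dot products regress the expensive fine-grained alignment scores of the teacher, making the embeddings usable with an efficient kNN index. The dimensions and the MSE objective are illustrative assumptions rather than ALADIN's exact training recipe.

# Sketch of the align-and-distill idea: a cheap shared embedding space is
# trained so that dot products mimic the expensive fine-grained alignment
# scores of a teacher. Sizes and the loss choice are illustrative.
import torch

B, D = 8, 256
image_emb = torch.randn(B, D, requires_grad=True)   # student image embeddings
text_emb = torch.randn(B, D, requires_grad=True)    # student text embeddings
teacher_scores = torch.randn(B, B)                  # fine-grained alignment matrix

student_scores = image_emb @ text_emb.T             # efficient: usable with kNN
loss = torch.nn.functional.mse_loss(student_scores, teacher_scores)
loss.backward()                                     # distills the teacher scores
print(loss.item())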
 Learning to Detect Fallen People in Virtual Worlds

F. Carrara, L. Pasco, C. Gennaro, F. Falchi

International Conference on Content-Based Multimedia Indexing (CBMI)
Graz, Austria. 14-16 Sept 2022

ISBN: 978-1-4503-9720-9 , Scopus: 2-s2.0-85139940448 , DOI: 10.1145/3549555.3549573

Falling is one of the most common causes of injury in all ages, especially in the elderly, where it is more frequent and severe. For this reason, a tool that can detect a fall in real time can be helpful in ensuring appropriate intervention and avoiding more serious damage. Some approaches available in the literature use sensors, wearable devices, or cameras with special features such as thermal or depth sensors. In this paper, we propose a Computer Vision deep-learning based approach for human fall detection based on largely available standard RGB cameras. A typical limitation of this kind of approaches is the lack of generalization to unseen environments. This is due to the error generated during human detection and, more generally, due to the unavailability of large-scale datasets that specialize in fall detection problems with different environments and fall types. In this work, we mitigate these limitations with a general-purpose object detector trained using a virtual world dataset in addition to real-world images. Through extensive experimental evaluation, we verified that by training our models on synthetic images as well, we were able to improve their ability to generalize. Code to reproduce results is available at https://github.com/lorepas/fallen-people-detection.
@inproceedings{10.1145/3549555.3549573,
author = {Carrara, Fabio and Pasco, Lorenzo and Gennaro, Claudio and Falchi, Fabrizio},
title = {Learning to Detect Fallen People in Virtual Worlds},
year = {2022},
isbn = {9781450397209},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3549555.3549573},
doi = {10.1145/3549555.3549573},
pages = {126–130},
numpages = {5},
keywords = {scarce data, object detection, visual fallen people detection, virtual worlds for synthetic data},
location = {Graz, Austria},
series = {CBMI '22}
}

 Deep features for CBIR with scarce data using Hebbian learning

G. Lagani, D. Bacciu, C. Gallicchio, F. Falchi, C. Gennaro, G. Amato

International Conference on Content-Based Multimedia Indexing (CBMI), pages 136-141.
Graz, Austria. 14-16 Sept 2022

ISBN: 978-1-4503-9720-9 , Scopus: 2-s2.0-85137486891 , DOI: 10.1145/3549555.3549587

Features extracted from Deep Neural Networks (DNNs) have proven to be very effective in the context of Content Based Image Retrieval (CBIR). Recently, biologically inspired Hebbian learning algorithms have shown promises for DNN training. In this contribution, we study the performance of such algorithms in the development of feature extractors for CBIR tasks. Specifically, we consider a semi-supervised learning strategy in two steps: first, an unsupervised pre-training stage is performed using Hebbian learning on the image dataset; second, the network is fine-tuned using supervised Stochastic Gradient Descent (SGD) training. For the unsupervised pre-training stage, we explore the nonlinear Hebbian Principal Component Analysis (HPCA) learning rule. For the supervised fine-tuning stage, we assume sample efficiency scenarios, in which the amount of labeled samples is just a small fraction of the whole dataset. Our experimental analysis, conducted on the CIFAR10 and CIFAR100 datasets, shows that, when few labeled samples are available, our Hebbian approach provides relevant improvements compared to various alternative methods.
@inproceedings{10.1145/3549555.3549587,
author = {Lagani, Gabriele and Bacciu, Davide and Gallicchio, Claudio and Falchi, Fabrizio and Gennaro, Claudio and Amato, Giuseppe},
title = {Deep Features for CBIR with Scarce Data Using Hebbian Learning},
year = {2022},
isbn = {9781450397209},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3549555.3549587},
doi = {10.1145/3549555.3549587},
booktitle = {Proceedings of the 19th International Conference on Content-Based Multimedia Indexing},
pages = {136–141},
numpages = {6},
keywords = {Hebbian Learning, Content Based Image Retrieval, Neural Networks, Bio-Inspired., Semi-Supervised, Deep Learning},
location = {Graz, Austria},
series = {CBMI '22}
}
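
The nonlinear Hebbian PCA (HPCA) rule used for the unsupervised pre-training stage can be sketched in a few lines of NumPy: each neuron moves its weights toward the input after subtracting the reconstruction due to itself and the preceding neurons (Sanger-style). The learning rate and the choice of nonlinearity are illustrative assumptions.

# Minimal NumPy sketch of the Hebbian PCA (HPCA) update.
import numpy as np

def hpca_step(W, x, lr=0.01, f=np.tanh):
    y = f(W @ x)                          # neuron activations
    for i in range(W.shape[0]):
        recon = y[: i + 1] @ W[: i + 1]   # sum_{j<=i} y_j w_j
        W[i] += lr * y[i] * (x - recon)   # move toward the residual input
    return W

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 16))   # 4 neurons, 16-dim inputs
for _ in range(100):
    W = hpca_step(W, rng.normal(size=16))
print(np.round(W @ W.T, 2))               # rows tend toward orthogonality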

 Fine-grained visual textual alignment for cross-modal retrieval using transformer encoders

N. Messina, G. Amato, A. Esuli, F. Falchi, C. Gennaro, S. Marchand-Maillet

In ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 17, issue 4, pp. 1-23, November 2021.
ACM

ISSN: 1551-6857 , eISSN: 1551-6865 , WOS: 000748857800012 , Scopus: 2-s2.0-85123302427 , DOI: 10.1145/3451390

Despite the evolution of deep-learning-based visual-textual processing systems, precise multi-modal matching remains a challenging task. In this work, we tackle the task of cross-modal retrieval through image-sentence matching based on word-region alignments, using supervision only at the global image-sentence level. Specifically, we present a novel approach called Transformer Encoder Reasoning and Alignment Network (TERAN). TERAN enforces a fine-grained match between the underlying components of images and sentences (i.e., image regions and words, respectively) to preserve the informative richness of both modalities. TERAN obtains state-of-the-art results on the image retrieval task on both MS-COCO and Flickr30k datasets. Moreover, on MS-COCO, it also outperforms current approaches on the sentence retrieval task. Focusing on scalable cross-modal information retrieval, TERAN is designed to keep the visual and textual data pipelines well separated. Cross-attention links invalidate any chance to separately extract visual and textual features needed for the online search and the offline indexing steps in large-scale retrieval systems. In this respect, TERAN merges the information from the two domains only during the final alignment phase, immediately before the loss computation. We argue that the fine-grained alignments produced by TERAN pave the way toward the research for effective and efficient methods for large-scale cross-modal information retrieval. We compare the effectiveness of our approach against relevant state-of-the-art methods. On the MS-COCO 1K test set, we obtain an improvement of 5.7% and 3.5% respectively on the image and the sentence retrieval tasks on the Recall@1 metric. The code used for the experiments is publicly available on GitHub at https://github.com/mesnico/TERAN.
@article{10.1145/3451390,
author = {Messina, Nicola and Amato, Giuseppe and Esuli, Andrea and Falchi, Fabrizio and Gennaro, Claudio and Marchand-Maillet, St\'{e}phane},
title = {Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders},
year = {2021},
issue_date = {November 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {17},
number = {4},
issn = {1551-6857},
url = {https://doi.org/10.1145/3451390},
doi = {10.1145/3451390},
journal = {ACM Trans. Multimedia Comput. Commun. Appl.},
month = {nov},
articleno = {128},
numpages = {23},
keywords = {natural language processing, cross-modal retrieval, computer vision, Deep learning, multi-modal matching}
}
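
The fine-grained matching idea can be sketched as a pooled region-word similarity: cosine similarities between all image regions and sentence words form an alignment matrix, which is pooled into a single global score. The max-over-regions, mean-over-words pooling below is one common variant, an illustrative assumption rather than TERAN's exact configuration.

# Sketch of a fine-grained region-word matching score.
import torch

regions = torch.nn.functional.normalize(torch.randn(36, 512), dim=1)  # image regions
words = torch.nn.functional.normalize(torch.randn(12, 512), dim=1)    # sentence words

alignment = words @ regions.T                 # (12, 36) word-region cosine similarities
score = alignment.max(dim=1).values.mean()    # best region per word, averaged
print(score.item())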

 Cross-Resolution Learning for Face Recognition

F.V. Massoli, G. Amato, F. Falchi

In Image and Vision Computing, vol. 99, July 2020, 103927

ISSN: 0262-8856, Online ISSN: 1872-8138
Scopus: 2-s2.0-85085261425, WOS: 000541130800003, DOI: 10.1016/j.imavis.2020.103927

Convolutional Neural Network models have reached extremely high performance on the Face Recognition task. Mostly used datasets, such as VGGFace2, focus on gender, pose, and age variations, in the attempt of balancing them to empower models to better generalize to unseen data. Nevertheless, image resolution variability is not usually discussed, which may lead to a resizing of 256 pixels. While specific datasets for very low-resolution faces have been proposed, less attention has been paid on the task of cross-resolution matching. Hence, the discrimination power of a neural network might seriously degrade in such a scenario. Surveillance systems and forensic applications are particularly susceptible to this problem since, in these cases, it is common that a low-resolution query has to be matched against higher-resolution galleries. Although it is always possible to either increase the resolution of the query image or to reduce the size of the gallery (less frequently), to the best of our knowledge, extensive experimentation of cross-resolution matching was missing in the recent deep learning-based literature. In the context of low- and cross-resolution Face Recognition, the contribution of our work is fourfold: i) we proposed a training procedure to fine-tune a state-of-the-art model to empower it to extract resolution-robust deep features; ii) we conducted an extensive test campaign by using high-resolution datasets (IJB-B and IJB-C) and surveillance-camera-quality datasets (QMUL-SurvFace, TinyFace, and SCface) showing the effectiveness of our algorithm to train a resolution-robust model; iii) even though our main focus was the cross-resolution Face Recognition, by using our training algorithm we also improved upon state-of-the-art model performances considering low-resolution matches; iv) we showed that our approach could be more effective concerning preprocessing faces with super-resolution techniques.

@ARTICLE{Massoli2020,
author={Massoli, F.V. and Amato, G. and Falchi, F.},
title={Cross-resolution learning for Face Recognition},
journal={Image and Vision Computing},
year={2020},
volume={99},
doi={10.1016/j.imavis.2020.103927},
art_number={103927},
note={cited By 1},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85085261425&doi=10.1016%2fj.imavis.2020.103927&partnerID=40&md5=f90a30b8e6ec045344f0e2806af8fe3c},
document_type={Article},
source={Scopus},
}
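
The resolution-robust fine-tuning idea can be sketched as a training-time augmentation: with some probability, a face crop is downscaled to a random low resolution and brought back to the network input size, so the extracted deep features become stable across resolutions. The probability and resolution range below are illustrative assumptions, not the exact training procedure.

# Sketch of cross-resolution training augmentation.
import random
import torch
import torch.nn.functional as F

def random_resolution(x, p=0.5, low=8, high=112, out=224):
    """x: (B, 3, out, out) batch of face crops."""
    if random.random() < p:
        side = random.randint(low, high)
        # Downscale to a random low resolution, then back to input size.
        x = F.interpolate(x, size=(side, side), mode="bilinear", align_corners=False)
        x = F.interpolate(x, size=(out, out), mode="bilinear", align_corners=False)
    return x

batch = torch.randn(4, 3, 224, 224)
print(random_resolution(batch).shape)  # torch.Size([4, 3, 224, 224])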

 Detection of Face Recognition Adversarial Attacks

F.V. Massoli, F. Carrara, G. Amato, F. Falchi

In Computer Vision and Image Understanding. Volume 202, January 2021, 103103

ISSN: 1077-3142
Scopus: 2-s2.0-85090405070, WOS: 000616091100009 DOI: 10.1016/j.cviu.2020.103103

Deep Learning methods have become state-of-the-art for solving tasks such as Face Recognition (FR). Unfortunately, despite their success, it has been pointed out that these learning models are exposed to adversarial inputs – images to which an imperceptible amount of noise for humans is added to maliciously fool a neural network – thus limiting their adoption in sensitive real-world applications. While it is true that an enormous effort has been spent to train robust models against this type of threat, adversarial detection techniques have recently started to draw attention within the scientific community. The advantage of using a detection approach is that it does not require to re-train any model; thus, it can be added to any system. In this context, we present our work on adversarial detection in forensics mainly focused on detecting attacks against FR systems in which the learning model is typically used only as features extractor. Thus, training a more robust classifier might not be enough to counteract the adversarial threats. In this frame, the contribution of our work is four-fold: (i) we test our proposed adversarial detection approach against classification attacks, i.e., adversarial samples crafted to fool an FR neural network acting as a classifier; (ii) using a k-Nearest Neighbor (k-NN) algorithm as a guide, we generate deep features attacks against an FR system based on a neural network acting as features extractor, followed by a similarity-based procedure which returns the query identity; (iii) we use the deep features attacks to fool an FR system on the 1:1 face verification task, and we show their superior effectiveness with respect to classification attacks in evading such type of system; (iv) we use the detectors trained on the classification attacks to detect the deep features attacks, thus showing that such approach is generalizable to different classes of offensives.

@article{MASSOLI2021103103,
 title = "Detection of Face Recognition Adversarial Attacks",
 journal = "Computer Vision and Image Understanding",
 volume = "202",
 pages = "103103",
 year = "2021",
 issn = "1077-3142",
 doi = "https://doi.org/10.1016/j.cviu.2020.103103",
 url = "http://www.sciencedirect.com/science/article/pii/S1077314220301296",
 author = "Fabio Valerio Massoli and Fabio Carrara and Giuseppe Amato and Fabrizio Falchi",
 keywords = "Deep Learning, Face Recognition, Adversarial attacks, Adversarial detection, Adversarial biometrics",
}

 Learning visual features for relational CBIR

N. Messina, G. Amato, F. Carrara, F. Falchi, C. Gennaro

In International Journal of Multimedia Information Retrieval (IJMIR), vol. 9, n. 2, pp. 113-124, Sept 2020.
Springer

ISSN: 2192-6611 , eISSN: 2192-662X , WOS: 000534963100004 , Scopus: 2-s2.0-85073946622 , DOI: 10.1007/s13735-019-00178-7

Recent works in deep-learning research highlighted remarkable relational reasoning capabilities of some carefully designed architectures. In this work, we employ a relationship-aware deep learning model to extract compact visual features used as relational image descriptors. In particular, we are interested in relational content-based image retrieval (R-CBIR), a task consisting in finding images containing similar inter-object relationships. Inspired by the relation networks (RN) employed in relational visual question answering (R-VQA), we present novel architectures to explicitly capture relational information from images in the form of network activations that can be subsequently extracted and used as visual features. We describe a two-stage relation network module (2S-RN), trained on the R-VQA task, able to collect non-aggregated visual features. Then, we propose the aggregated visual features relation network (AVF-RN) module that is able to produce better relationship-aware features by learning the aggregation directly inside the network. We employ an R-CBIR ground-truth built by exploiting scene-graphs similarities available in the CLEVR dataset in order to rank images in a relational fashion. Experiments show that features extracted from our 2S-RN model provide an improved retrieval performance with respect to standard non-relational methods. Moreover, we demonstrate that the features extracted from the novel AVF-RN can further improve the performance measured on the R-CBIR task, reaching the state-of-the-art on the proposed dataset.
@article{DBLP:journals/ijmir/MessinaACFG20,
  author    = {Nicola Messina and
               Giuseppe Amato and
               Fabio Carrara and
               Fabrizio Falchi and
               Claudio Gennaro},
  title     = {Learning visual features for relational {CBIR}},
  journal   = {Int. J. Multim. Inf. Retr.},
  volume    = {9},
  number    = {2},
  pages     = {113--124},
  year      = {2020},
  url       = {https://doi.org/10.1007/s13735-019-00178-7},
  doi       = {10.1007/s13735-019-00178-7},
  timestamp = {Fri, 18 Nov 2022 17:12:34 +0100},
  biburl    = {https://dblp.org/rec/journals/ijmir/MessinaACFG20.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
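
The relation-network core behind these relational descriptors can be sketched in PyTorch: a small MLP g scores every ordered pair of object features, and the results are aggregated into a single vector that can serve as a relationship-aware image descriptor. Layer sizes and the sum aggregation are illustrative assumptions.

# Sketch of a relation-network (RN) module producing a relational feature.
import torch
import torch.nn as nn

class RelationModule(nn.Module):
    def __init__(self, obj_dim=64, rel_dim=128):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * obj_dim, rel_dim), nn.ReLU(),
                               nn.Linear(rel_dim, rel_dim))

    def forward(self, objects):               # objects: (N, obj_dim)
        n = objects.size(0)
        a = objects.unsqueeze(0).expand(n, n, -1)
        b = objects.unsqueeze(1).expand(n, n, -1)
        pairs = torch.cat([a, b], dim=-1).reshape(n * n, -1)
        return self.g(pairs).sum(dim=0)       # aggregated relational feature

feats = RelationModule()(torch.randn(10, 64))
print(feats.shape)  # torch.Size([128])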

 Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection

L. Vadicamo, C. Gennaro, F. Falchi, E. Chávez, R. Connor, G. Amato

In Information Systems, vol. 95, January 2021.

ISSN: 0306-4379, Online ISSN: 1873-6076
Scopus: 2-s2.0-85079425963&, WOS: 000581494100003, DOI: 10.1016/j.is.2020.101506

Approximate Nearest Neighbor (ANN) search is a prevalent paradigm for searching intrinsically high dimensional objects in large-scale data sets. Recently, the permutation-based approach for ANN has attracted a lot of interest due to its versatility in being used in the more general class of metric spaces. In this approach, the entire database is ranked by a permutation distance to the query. Typically, permutations allow the efficient selection of a candidate set of results, but typically to achieve high recall or precision this set has to be reviewed using the original metric and data. This can lead to a sizeable percentage of the database being recalled, along with many expensive distance calculations. To reduce the number of metric computations and the number of database elements accessed, we propose here a re-ranking based on a local embedding using the nSimplex projection. The nSimplex projection produces Euclidean vectors from objects in metric spaces which possess the n-point property. The mapping is obtained from the distances to a set of reference objects, and the original metric can be lower bounded and upper bounded by the Euclidean distance of objects sharing the same set of references. Our approach is particularly advantageous for extensive databases or expensive metric function. We reuse the distances computed in the permutations in the first stage, and hence the memory footprint of the index is not increased. An extensive experimental evaluation of our approach is presented, demonstrating excellent results even on a set of hundreds of millions of objects.
@article{VADICAMO2021101506,
title = {Re-ranking via local embeddings: A use case with permutation-based indexing and the nSimplex projection},
journal = {Information Systems},
volume = {95},
pages = {101506},
year = {2021},
issn = {0306-4379},
doi = {https://doi.org/10.1016/j.is.2020.101506},
url = {https://www.sciencedirect.com/science/article/pii/S030643792030017X},
author = {Lucia Vadicamo and Claudio Gennaro and Fabrizio Falchi and Edgar Chávez and Richard Connor and Giuseppe Amato},
keywords = {Metric search, Permutation-based indexing, n-point property, nSimplex projection, Metric local embeddings, Distance bounds},
}
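
A simplified sketch of the two-stage scheme: a permutation-based filter (Spearman footrule between pivot rankings) selects a candidate set, which is then re-ranked. For brevity, the re-ranking below recomputes the original metric directly; in the paper this step is instead performed with nSimplex-based distance bounds derived from the already-computed object-pivot distances.

# Two-stage search: permutation filter, then re-rank (simplified).
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 32))
pivots = data[rng.choice(1000, 16, replace=False)]

def permutation(x):
    """Rank of each pivot by its distance to x."""
    d = np.linalg.norm(pivots - x, axis=1)
    return np.argsort(np.argsort(d))

perms = np.array([permutation(x) for x in data])

def search(q, k=10, candidates=100):
    # Stage 1: Spearman footrule between permutations (cheap filter).
    footrule = np.abs(perms - permutation(q)).sum(axis=1)
    cand = np.argsort(footrule)[:candidates]
    # Stage 2: re-rank the candidates (here with the original metric).
    exact = np.linalg.norm(data[cand] - q, axis=1)
    return cand[np.argsort(exact)[:k]]

print(search(rng.normal(size=32)))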

 Cross-resolution face recognition adversarial attacks

F.V. Massoli, F. Falchi, G. Amato

In Pattern Recognition Letters, vol. 140, December 2020, p. 222-229.

ISSN: 0167-8655
DOI: 10.1016/j.patrec.2020.10.008

Face Recognition is among the best examples of computer vision problems where the supremacy of deep learning techniques compared to standard ones is undeniable. Unfortunately, it has been shown that they are vulnerable to adversarial examples - input images to which a human imperceptible perturbation is added to lead a learning model to output a wrong prediction. Moreover, in applications such as biometric systems and forensics, cross-resolution scenarios are easily met with a non-negligible impact on the recognition performance and adversary’s success. Despite the existence of such vulnerabilities set a harsh limit to the spread of deep learning-based face recognition systems to real-world applications, a comprehensive analysis of their behavior when threatened in a cross-resolution setting is missing in the literature. In this context, we posit our study, where we harness several of the strongest adversarial attacks against deep learning-based face recognition systems considering the cross-resolution domain. To craft adversarial instances, we exploit attacks based on three different metrics, i.e., L1, L2, and L∞, and we study the resilience of the models across resolutions. We then evaluate the performance of the systems against the face identification protocol, open- and close-set. In our study, we find that the deep representation attacks represents a much dangerous menace to a face recognition system than the ones based on the classification output independently from the used metric. Furthermore, we notice that the input image’s resolution has a non-negligible impact on an adversary’s success in deceiving a learning model. Finally, by comparing the performance of the threatened networks under analysis, we show how they can benefit from a cross-resolution training approach in terms of resilience to adversarial attacks.
@article{MASSOLI2020222,
title = "Cross-resolution face recognition adversarial attacks",
journal = "Pattern Recognition Letters",
volume = "140",
pages = "222 - 229",
year = "2020",
issn = "0167-8655",
doi = "https://doi.org/10.1016/j.patrec.2020.10.008",
url = "http://www.sciencedirect.com/science/article/pii/S0167865520303950",
author = "Fabio Valerio Massoli and Fabrizio Falchi and Giuseppe Amato",
keywords = "Deep learning, Face recognition, Adversarial attacks, Face identification, Adversarial biometrics",
}


 About Deep Learning, Intuition and Thinking

F. Falchi

ERCIM News 116 - Special theme: Transparency in Algorithmic Decision Making.

WOS: 000456690000009

“Intuition is nothing more and nothing less than recognition” is a famous quote by Herbert Simon, who received the Turing Award in 1975 and the Nobel Prize in 1978. As explained by Daniel Kahneman, another Nobel Prize winner, in his book Thinking, Fast and Slow, and during his talk at Google in 2011: “There is really no difference between the physician recognising a particular disease from a facial expression and a little child learning, pointing to something and saying doggie. The little child has no idea what the clues are but he just said, he just knows this is a dog without knowing why he knows”. These milestones should be used as a guideline to help understand decision making in recent AI algorithms and thus their transparency.
@Article{FalchiERCIMNews2019,
author="Falchi, Fabrizio",
title="About Deep Learning, Intuition and Thinking",
journal="ERCIM News",
year="2019",
month="January",
issn="0926-4981",
url="https://ercim-news.ercim.eu/images/stories/EN116/EN116-web.pdf"
}


 Exploiting CNN Layer Activations to Improve Adversarial Image Classification

R. Caldelli, R. Becarelli, F. Carrara, F. Falchi, G. Amato

2019 IEEE International Conference on Image Processing (ICIP) Taipei, 22-25 September 2019

Volume 2019-September, pages 2289-2293, article number 8803776. 26th ICIP 2019. Received the Best Paper Award. ISSN: 1522-4880 , ISBN: 978-153866249-6 , Scopus: 2-s2.0-85076808262 , DOI: 10.1109/ICIP.2019.8803776

Neural networks are now used in many sectors of our daily life thanks to efficient solutions such instruments provide for diverse tasks. Leaving to artificial intelligence the chance to make choices on behalf of humans inevitably exposes these tools to be fraudulently attacked. In fact, adversarial examples, intentionally crafted to fool a neural network, can dangerously induce a misclassification though appearing innocuous for a human observer. On such a basis, this paper focuses on the problem of image classification and proposes an analysis to better insight what happens inside a convolutional neural network (CNN) when it evaluates an adversarial example. In particular, the activations of the internal network layers have been analyzed and exploited to design possible countermeasures to reduce CNN vulnerability. Experimental results confirm that layer activations can be adopted to detect adversarial inputs.
@INPROCEEDINGS{8803776,
	author={R. {Caldelli} and R. {Becarelli} and F. {Carrara} and F. {Falchi} and G. {Amato}},
	booktitle={2019 IEEE International Conference on Image Processing (ICIP)},
	title={Exploiting CNN Layer Activations to Improve Adversarial Image Classification},
	year={2019},
	volume={},
	number={},
	pages={2289-2293},
	keywords={Adversarial images;neural networks;layer activations;adversarial detection},
	doi={10.1109/ICIP.2019.8803776},
	ISSN={2381-8549},
	month={Sep.}
}
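
The countermeasure idea can be sketched as follows: summary statistics of the internal layer activations become features for a simple binary classifier that separates authentic from adversarial inputs. The statistics, the synthetic stand-in data, and the logistic-regression detector below are illustrative assumptions, not the paper's exact design.

# Sketch of detecting adversarial inputs from internal CNN activations.
import numpy as np
from sklearn.linear_model import LogisticRegression

def activation_signature(layer_activations):
    """layer_activations: list of per-layer activation arrays for one image."""
    return np.array([s for a in layer_activations for s in (a.mean(), a.std())])

rng = np.random.default_rng(0)
def fake_acts(shift):  # stand-in for activations of clean/adversarial images
    return [rng.normal(loc=shift * i, size=256) for i in range(1, 4)]

X = np.stack([activation_signature(fake_acts(s)) for s in [0.0] * 50 + [0.3] * 50])
y = np.array([0] * 50 + [1] * 50)  # 0 = authentic, 1 = adversarial
detector = LogisticRegression().fit(X, y)
print(detector.score(X, y))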


 Large-scale instance-level image retrieval

G. Amato, F. Carrara, F. Falchi, C. Gennaro, L. Vadicamo

Information Processing & Management special issue on Deep Learning for Information Retrieval.

Volume 57, Issue 6. November 2020. Article number 102100
ISSN: 0306-4573 , WOS: 000582206800004 , Scopus: 2-s2.0-85071397229 , DOI: 10.1016/j.ipm.2019.102100

The great success of visual features learned from deep neural networks has led to a significant effort to develop efficient and scalable technologies for image retrieval. Nevertheless, its usage in large-scale Web applications of content-based retrieval is still challenged by their high dimensionality. To overcome this issue, some image retrieval systems employ the product quantization method to learn a large-scale visual dictionary from a training set of global neural network features. These approaches are implemented in main memory, preventing their usage in big-data applications. The contribution of the work is mainly devoted to investigating some approaches to transform neural network features into text forms suitable for being indexed by a standard full-text retrieval engine such as Elasticsearch. The basic idea of our approaches relies on a transformation of neural network features with the twofold aim of promoting the sparsity without the need of unsupervised pre-training. We validate our approach on a recent convolutional neural network feature, namely Regional Maximum Activations of Convolutions (R-MAC), which is a state-of-art descriptor for image retrieval. Its effectiveness has been proved through several instance-level retrieval benchmarks. An extensive experimental evaluation conducted on the standard benchmarks shows the effectiveness and efficiency of the proposed approach and how it compares to state-of-the-art main-memory indexes.
				@article{amato2019large,
				 title={Large-scale instance-level image retrieval},
				 author={Amato, Giuseppe and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia},
				 journal={Information Processing \& Management},
				 pages={102100},
				 year={2019},
				 doi={10.1016/j.ipm.2019.102100},
				 publisher={Elsevier}
				}
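
The feature-to-text transformation can be sketched as follows: each non-negative (sparsified) component of a deep feature vector becomes a synthetic term repeated proportionally to its magnitude, so that the term-frequency scoring of a full-text engine such as Elasticsearch approximates the dot product between vectors. The quantization factor and token naming are illustrative assumptions.

# Sketch of a surrogate-text representation for full-text indexing.
import numpy as np

def surrogate_text(feature, scale=10):
    relu = np.maximum(feature, 0)                # promote sparsity
    counts = np.floor(relu * scale).astype(int)  # quantize magnitudes
    return " ".join(f"f{i}" for i, c in enumerate(counts) for _ in range(c))

v = np.array([0.31, 0.0, -0.2, 0.52])
print(surrogate_text(v))  # -> "f0 f0 f0 f3 f3 f3 f3 f3"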

 Learning Safety Equipment Detection using Virtual Worlds

M. Di Benedetto, E. Meloni, G. Amato, F. Falchi, C. Gennaro

International Conference on Content-Based Multimedia Indexing (CBMI)
Dublin, Ireland. 4-6 Sept 2019
IEEE

Received the Best Paper Award. ISSN: 1949-3983 , ISBN: 978-1-7281-4674-4 , eISSN: 1949-3991 , eISBN: 978-1-7281-4673-7 , Scopus: 2-s2.0-85074336647 , DOI: 10.1109/CBMI.2019.8877466



 Testing Deep Neural Networks on the Same-Different Task

N. Messina, G. Amato, F. Carrara, F. Falchi, C. Gennaro

International Conference on Content-Based Multimedia Indexing (CBMI)
Dublin, Ireland. 4-6 Sept 2019
IEEE

ISSN: 1949-3983 , ISBN: 978-1-7281-4674-4 , eISSN: 1949-3991 , eISBN: 978-1-7281-4673-7 , Scopus: 2-s2.0-85074355099 , DOI: 10.1109/CBMI.2019.8877412

Developing abstract reasoning abilities in neural networks is an important goal towards the achievement of human-like performances on many tasks. As of now, some works have tackled this problem, developing ad-hoc architectures and reaching overall good generalization performances. In this work we try to understand to what extent state-of-the-art convolutional neural networks for image classification are able to deal with a challenging abstract problem, the so-called same-different task. This problem consists in understanding if two random shapes inside the same image are the same or not. A recent work demonstrated that simple convolutional neural networks are almost unable to solve this problem. We extend their work, showing that ResNet-inspired architectures are able to learn, while VGG cannot converge. In light of this, we suppose that residual connections have some important role in the learning process, while the depth of the network seems not so relevant. In addition, we carry out some targeted tests on the converged architectures to figure out to what extent they are able to generalize to never seen patterns. However, further investigation is needed in order to understand what are the architectural peculiarities and limits as far as abstract reasoning is concerned.
@INPROCEEDINGS{8877412,
  author={Messina, Nicola and Amato, Giuseppe and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio},
  booktitle={2019 International Conference on Content-Based Multimedia Indexing (CBMI)}, 
  title={Testing Deep Neural Networks on the Same-Different Task}, 
  year={2019},
  volume={},
  number={},
  pages={1-6},
  doi={10.1109/CBMI.2019.8877412}}

 On the Robustness to Adversarial Examples of Neural ODE Image Classifiers

F. Carrara, R. Caldelli, F. Falchi, G. Amato

In Proceedings of 2019 IEEE International Workshop on Information Forensics and Security (WIFS 2019)

ISSN: 2157-4774, ISBN: 978-172813217-4
Scopus: 2-s2.0-85083024037, DOI: 10.1109/WIFS47025.2019.9035109

The vulnerability of deep neural networks to adversarial attacks currently represents one of the most challenging open problems in the deep learning field. The NeurIPS 2018 work that obtained the best paper award proposed a new paradigm for defining deep neural networks with continuous internal activations. In this kind of networks, dubbed Neural ODE Networks, a continuous hidden state can be defined via parametric ordinary differential equations, and its dynamics can be adjusted to build representations for a given task, such as image classification. In this paper, we analyze the robustness of image classifiers implemented as ODE Nets to adversarial attacks and compare it to standard deep models. We show that Neural ODE are natively more robust to adversarial attacks with respect to state-of-the-art residual networks, and some of their intrinsic properties, such as adaptive computation cost, open new directions to further increase the robustness of deep-learned models. Moreover, thanks to the continuity of the hidden state, we are able to follow the perturbation injected by manipulated inputs and pinpoint the part of the internal dynamics that is most responsible for the misclassification.

@InProceedings{carrara2019wifs,
 title={On the Robustness to Adversarial Examples of Neural ODE Image Classifiers},
 author={Fabio Carrara and Roberto Caldelli and Fabrizio Falchi and Giuseppe Amato},
 booktitle={2019 IEEE International Workshop on Information Forensics and Security (WIFS)},
 year={2019},
 publisher={IEEE},
}

 Learning Pedestrian Detection from Virtual Worlds

G. Amato, L. Ciampi, F. Falchi, C. Gennaro, N. Messina

In Image Analysis and Processing - ICIAP 2019
20th International Conference, Trento, Italy, September 9–13, 2019, Proceedings, Part I, pages 302-312.

Lecture Notes in Computer Science, vol. 11751
Also part of the Image Processing, Computer Vision, Pattern Recognition, and Graphics book sub series (LNIP, volume 11751)
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-030-30641-0, eISBN: 978-3-030-30642-7
SCOPUS: 2-s2.0-85072973976, DOI: 10.1007/978-3-030-30642-7_27

In this paper, we present a real-time pedestrian detection system that has been trained using a virtual environment. This is a very popular topic of research having endless practical applications and recently, there was an increasing interest in deep learning architectures for performing such a task. However, the availability of large labeled datasets is a key point for an effective train of such algorithms. For this reason, in this work, we introduced ViPeD, a new synthetically generated set of images extracted from a realistic 3D video game where the labels can be automatically generated exploiting 2D pedestrian positions extracted from the graphics engine. We exploited this new synthetic dataset fine-tuning a state-of-the-art computationally efficient Convolutional Neural Network (CNN). A preliminary experimental evaluation, compared to the performance of other existing approaches trained on real-world images, shows encouraging results.
@InProceedings{10.1007/978-3-030-30642-7_27,
 author="Amato, Giuseppe and Ciampi, Luca and Falchi, Fabrizio and Gennaro, Claudio and Messina, Nicola",
 editor="Ricci, Elisa and Rota Bul{\`o}, Samuel and Snoek, Cees and Lanz, Oswald and Messelodi, Stefano and Sebe, Nicu",
 title="Learning Pedestrian Detection from Virtual Worlds",
 booktitle="Image Analysis and Processing -- ICIAP 2019",
 year="2019",
 publisher="Springer International Publishing",
 address="Cham",
 pages="302--312",
 isbn="978-3-030-30642-7"
}

 Hebbian Learning Meets Deep Convolutional Neural Networks

G. Amato, F. Carrara, F. Falchi, C. Gennaro, G. Lagani

In Image Analysis and Processing - ICIAP 2019
20th International Conference, Trento, Italy, September 9–13, 2019, Proceedings, Part I, pages 324-334.

Lecture Notes in Computer Science, vol. 11751
Also part of the Image Processing, Computer Vision, Pattern Recognition, and Graphics book sub series (LNIP, volume 11751)
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-030-30641-0, eISBN: 978-3-030-30642-7
SCOPUS: 2-s2.0-85072975843, DOI: 10.1007/978-3-030-30642-7_29

Neural networks are said to be biologically inspired since they mimic the behavior of real neurons. However, several processes in state-of-the-art neural networks, including Deep Convolutional Neural Networks (DCNN), are far from the ones found in animal brains. One relevant difference is the training process. In state-of-the-art artificial neural networks, the training process is based on backpropagation and Stochastic Gradient Descent (SGD) optimization. However, studies in neuroscience strongly suggest that this kind of processes does not occur in the biological brain. Rather, learning methods based on Spike-Timing-Dependent Plasticity (STDP) or the Hebbian learning rule seem to be more plausible, according to neuroscientists. In this paper, we investigate the use of the Hebbian learning rule when training Deep Neural Networks for image classification by proposing a novel weight update rule for shared kernels in DCNNs. We perform experiments using the CIFAR-10 dataset in which we employ Hebbian learning, along with SGD, to train parts of the model or whole networks for the task of image classification, and we discuss their performance thoroughly considering both effectiveness and efficiency aspects.
@InProceedings{10.1007/978-3-030-30642-7_29,
author="Amato, Giuseppe and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio and Lagani, Gabriele",
editor="Ricci, Elisa and Rota Bul{\`o}, Samuel and Snoek, Cees and Lanz, Oswald and Messelodi, Stefano and Sebe, Nicu",
title="Hebbian Learning Meets Deep Convolutional Neural Networks",
booktitle="Image Analysis and Processing -- ICIAP 2019",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="324--334",
isbn="978-3-030-30642-7"
}

 Evaluation of continuous image features learned by ODE Nets

F. Carrara, G. Amato, F. Falchi, C. Gennaro

In Image Analysis and Processing - ICIAP 2019
20th International Conference, Trento, Italy, September 9–13, 2019, Proceedings, Part I, pages 432-442.

Lecture Notes in Computer Science, vol. 11751
Also part of the Image Processing, Computer Vision, Pattern Recognition, and Graphics book sub series (LNIP, volume 11751)
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-030-30641-0, eISBN: 978-3-030-30642-7
SCOPUS: 2-s2.0-85072966872, DOI: 10.1007/978-3-030-30642-7_39

Deep-learning approaches in data-driven modeling rely on learning a finite number of transformations (and representations) of the data that are structured in a hierarchy and are often instantiated as deep neural networks (and their internal activations). State-of-the-art models for visual data usually implement deep residual learning: the network learns to predict a finite number of discrete updates that are applied to the internal network state to enrich it. Pushing the residual learning idea to the limit, ODE Net—a novel network formulation involving continuously evolving internal representations that gained the best paper award at NeurIPS 2018—has been recently proposed. Differently from traditional neural networks, in this model the dynamics of the internal states are defined by an ordinary differential equation with learnable parameters that defines a continuous transformation of the input representation. These representations can be computed using standard ODE solvers, and their dynamics can be steered to learn the input-output mapping by adjusting the ODE parameters via standard gradient-based optimization. In this work, we investigate the image representation learned in the continuous hidden states of ODE Nets. In particular, we train image classifiers including ODE-defined continuous layers and perform preliminary experiments to assess the quality, in terms of transferability and generality, of the learned image representations and compare them to standard representation extracted from residual networks. Experiments on CIFAR-10 and Tiny-ImageNet-200 datasets show that representations extracted from ODE Nets are more transferable and suggest an improved robustness to overfit.
@InProceedings{10.1007/978-3-030-30642-7_39,
author="Carrara, Fabio and Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio",
editor="Ricci, Elisa and Rota Bul{\`o}, Samuel and Snoek, Cees and Lanz, Oswald and Messelodi, Stefano and Sebe, Nicu",
title="Evaluation of Continuous Image Features Learned by ODE Nets",
booktitle="Image Analysis and Processing -- ICIAP 2019",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="432--442",
isbn="978-3-030-30642-7",
doi="10.1007/978-3-030-30642-7_39"
}
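
An ODE-defined layer can be sketched in PyTorch as follows: the hidden state evolves according to learnable dynamics f(h, t), integrated here with a fixed-step Euler solver for clarity; ODE Nets proper use adaptive solvers and the adjoint method for memory-efficient training. Sizes and step count are illustrative assumptions.

# Minimal sketch of an ODE-defined layer with fixed-step Euler integration.
import torch
import torch.nn as nn

class ODEBlock(nn.Module):
    def __init__(self, dim=64, steps=10):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim + 1, dim), nn.Tanh(),
                               nn.Linear(dim, dim))
        self.steps = steps

    def forward(self, h):                      # h: (B, dim), t in [0, 1]
        dt = 1.0 / self.steps
        for k in range(self.steps):
            t = torch.full((h.size(0), 1), k * dt)
            h = h + dt * self.f(torch.cat([h, t], dim=1))  # Euler step
        return h                               # continuous representation

feats = ODEBlock()(torch.randn(8, 64))
print(feats.shape)  # torch.Size([8, 64])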

 Improving Multi-Scale Face Recognition using VGGFace2

F.V. Massoli, F. Falchi, G. Amato, C. Gennaro, C. Vairo

In New Trends in Image Analysis and Processing - ICIAP 2019, ICIAP International Workshops, BioFor, PatReCH, e-BADLE, DeepRetail, and Industrial Session, Trento, Italy, September 9–10, 2019, Revised Selected Papers, pages 21-29.

Lecture Notes in Computer Science, vol. 11808
Also part of the Image Processing, Computer Vision, Pattern Recognition, and Graphics book sub series (LNIP, volume 11808)
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-030-30753-0, eISBN: 978-3-030-30754-7
Scopus: 2-s2.0-85072862213, DOI: 10.1007/978-3-030-30754-7_3

Convolutional neural networks have reached extremely high performances on the Face Recognition task. These models are commonly trained by using high-resolution images and for this reason, their discrimination ability is usually degraded when they are tested against low-resolution images. Thus, Low-Resolution Face Recognition remains an open challenge for deep learning models. Such a scenario is of particular interest for surveillance systems in which it usually happens that a low-resolution probe has to be matched with higher resolution galleries. This task can be especially hard to accomplish since the probe can have resolutions as low as 8, 16 and 24 pixels per side while the typical input of state-of-the-art neural network is 224. In this paper, we described the training campaign we used to fine-tune a ResNet-50 architecture, with Squeeze-and-Excitation blocks, on the tasks of very low and mixed resolutions face recognition. For the training process we used the VGGFace2 dataset and then we tested the performance of the final model on the IJB-B dataset; in particular, we tested the neural network on the 1:1 verification task. In our experiments we considered two different scenarios: (1) probe and gallery with same resolution; (2) probe and gallery with mixed resolutions. Experimental results show that with our approach it is possible to improve upon state-of-the-art models performance on the low and mixed resolution face recognition tasks with a negligible loss at very high resolutions.
@InProceedings{10.1007/978-3-030-30754-7_3,
 author="Massoli, Fabio Valerio and Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio and Vairo, Claudio",
 editor="Cristani, Marco and Prati, Andrea and Lanz, Oswald
 and Messelodi, Stefano and Sebe, Nicu",
 title="Improving Multi-scale Face Recognition Using VGGFace2",
 booktitle="New Trends in Image Analysis and Processing -- ICIAP 2019",
 year="2019",
 publisher="Springer International Publishing",
 address="Cham",
 pages="21--29",
 isbn="978-3-030-30754-7"
}
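
A minimal sketch of the resolution handling described in the abstract, assuming PIL is used for image manipulation (the file name and interpolation choice are ours): a face crop is down-sampled to 8, 16, or 24 pixels per side and then resized back to the 224-pixel network input.

from PIL import Image

NET_INPUT = 224  # typical input side of state-of-the-art networks

def make_low_res(img: Image.Image, side: int) -> Image.Image:
    """Down-sample to side x side, then up-sample back to the CNN input size."""
    small = img.resize((side, side), Image.BILINEAR)
    return small.resize((NET_INPUT, NET_INPUT), Image.BILINEAR)

face = Image.open("face.jpg").convert("RGB")
probes = {side: make_low_res(face, side) for side in (8, 16, 24)}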

 Metric Embedding into the Hamming Space with the n-Simplex Projection

L. Vadicamo, V. Mic, F. Falchi, P. Zezula

In Similarity Search and Applications
12th International Conference, SISAP 2019, Newark, NJ, USA, October 2–4, 2019, Proceedings.

Lecture Notes in Computer Science, vol. 11807
Also part of the Information Systems and Applications, incl. Internet/Web, and HCI book sub series (LNISA, volume 11807)
Additional material for the paper is available online.
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-030-32046-1, eISBN: 978-3-030-32047-8
Scopus: 2-s2.0-85076087049, DOI: 10.1007/978-3-030-32047-8_23

Transformations of data objects into the Hamming space are often exploited to speed up similarity search in metric spaces. Techniques applicable in generic metric spaces require expensive learning, e.g., the selection of pivoting objects. However, when searching in the common Euclidean space, the best performance is usually achieved by transformations specifically designed for this space. We propose a novel transformation technique that provides a good trade-off between applicability and the quality of the space approximation. It uses the n-Simplex projection to transform metric objects into a low-dimensional Euclidean space, and then transforms this space into the Hamming space. We compare our approach theoretically and experimentally with several techniques for metric embedding into the Hamming space, focusing on the applicability, the learning cost, and the quality of the search-space approximation.
@InProceedings{10.1007/978-3-030-32047-8_23,
author="Vadicamo, Lucia and Mic, Vladimir and Falchi, Fabrizio and Zezula, Pavel",
editor="Amato, Giuseppe and Gennaro, Claudio and Oria, Vincent and Radovanovi{\'{c}} , Milo{\v{s}}",
title="Metric Embedding into the Hamming Space with the n-Simplex Projection",
booktitle="Similarity Search and Applications",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="265--272",
doi="10.1007/978-3-030-32047-8_23",
isbn="978-3-030-32047-8"
}
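
The final Euclidean-to-Hamming step can be illustrated with sign-of-random-projection hashing, a standard technique; the specific transformation adopted in the paper may differ, and the n-Simplex projection itself is assumed to be given (here, the input vector).

import numpy as np

rng = np.random.default_rng(0)
dim, n_bits = 32, 64
hyperplanes = rng.standard_normal((n_bits, dim))

def to_hamming(proj: np.ndarray) -> np.ndarray:
    """Map an n-Simplex-projected vector to an n_bits binary code."""
    return (hyperplanes @ proj > 0).astype(np.uint8)

def hamming(a: np.ndarray, b: np.ndarray) -> int:
    return int(np.count_nonzero(a != b))

x, y = rng.standard_normal(dim), rng.standard_normal(dim)
print(hamming(to_hamming(x), to_hamming(y)))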

 SPLX-Perm: A Novel Permutation-Based Representation for Approximate Metric Search

L. Vadicamo, R. Connor, F. Falchi, C. Gennaro, F. Rabitti

In Similarity Search and Applications
12th International Conference, SISAP 2019, Newark, NJ, USA, October 2–4, 2019, Proceedings.

Lecture Notes in Computer Science, vol. 11807
Also part of the Information Systems and Applications, incl. Internet/Web, and HCI book sub series (LNISA, volume 11807)
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-030-32046-1, eISBN: 978-3-030-32047-8
Scopus: 2-s2.0-85076117598, DOI: 10.1007/978-3-030-32047-8_4

Many approaches for approximate metric search rely on a permutation-based representation of the original data objects. The main advantage of transforming metric objects into permutations is that the latter can be efficiently indexed and searched using data structures such as inverted files and prefix trees. Typically, the permutation is obtained by ordering the identifiers of a set of pivots according to their distances to the object to be represented. In this paper, we present a novel approach to transform metric objects into permutations. It uses the object-pivot distances in combination with a metric transformation called the n-Simplex projection. The resulting permutation-based representation, named SPLX-Perm, is suitable only for the large class of metric spaces satisfying the n-point property. We tested the proposed approach on two benchmarks for similarity search. Our preliminary results are encouraging and open new perspectives for further investigations on the use of the n-Simplex projection for supporting permutation-based indexing.
@InProceedings{10.1007/978-3-030-32047-8_4,
author="Vadicamo, Lucia and Connor, Richard and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto",
editor="Amato, Giuseppe and Gennaro, Claudio and Oria, Vincent and Radovanovi{\'{c}}, Milo{\v{s}}",
title="SPLX-Perm: A Novel Permutation-Based Representation for Approximate Metric Search",
booktitle="Similarity Search and Applications",
year="2019",
publisher="Springer International Publishing",
address="Cham",
doi="10.1007/978-3-030-32047-8_4",
isbn="978-3-030-32047-8"
}
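
For context, this is a sketch of the conventional pivot-based permutation that SPLX-Perm builds upon: pivot identifiers are ordered by their distance to the object to be represented. SPLX-Perm instead orders the coordinates of the n-Simplex projection of the object, which is not reproduced here; the pivots and dimensions below are placeholders.

import numpy as np

rng = np.random.default_rng(1)
pivots = rng.standard_normal((10, 16))  # 10 pivots in a 16-dimensional space

def permutation(obj: np.ndarray) -> np.ndarray:
    """IDs of the pivots sorted from closest to farthest from obj."""
    dists = np.linalg.norm(pivots - obj, axis=1)
    return np.argsort(dists)

print(permutation(rng.standard_normal(16)))  # e.g. [3 7 0 ...]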

 An Image Retrieval System for Video

P. Bolettieri, F. Carrara, F. Debole, F. Falchi, C. Gennaro, L. Vadicamo, C. Vairo

In Similarity Search and Applications
12th International Conference, SISAP 2019, Newark, NJ, USA, October 2–4, 2019, Proceedings.

Lecture Notes in Computer Science, vol. 11807
Also part of the Information Systems and Applications, incl. Internet/Web, and HCI book sub series (LNISA, volume 11807)
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-030-32046-1, eISBN: 978-3-030-32047-8
Scopus: 2-s2.0-85076117082, DOI: 10.1007/978-3-030-32047-8_29

Content-Based Image Indexing and Retrieval (CBIR) has been an active research area since the 1970s. Nowadays, the rapid increase of video data has paved the way for the advancement of technologies in many different communities for Content-Based Video Indexing and Retrieval (CBVIR). However, greater attention needs to be devoted to the development of effective tools for video search and browsing. In this paper, we present VISIONE, a system for large-scale video retrieval. The system integrates several content-based analysis and retrieval modules, including keyword search, spatial object-based search, and visual similarity search. In tests where users needed to find as many correct examples as possible, the similarity search proved to be the most promising option. Our implementation is based on state-of-the-art deep learning approaches for content analysis and leverages highly efficient indexing techniques to ensure scalability. Specifically, we encode all the visual and textual descriptors extracted from the videos into (surrogate) textual representations that are then efficiently indexed and searched using an off-the-shelf text search engine with suitable similarity functions.
@InProceedings{10.1007/978-3-030-32047-8_29,
author="Bolettieri, Paolo and Carrara, Fabio and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Vadicamo, Lucia and Vairo, Claudio",
editor="Amato, Giuseppe and Gennaro, Claudio and Oria, Vincent
and Radovanovi{\'{c}} , Milo{\v{s}}",
title="An Image Retrieval System for Video",
booktitle="Similarity Search and Applications",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="332--339",
doi="10.1007/978-3-030-32047-8_29",
isbn="978-3-030-32047-8"
}
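
The surrogate-text idea mentioned in the abstract can be sketched as follows: each feature dimension becomes a synthetic term repeated proportionally to its quantized activation, so that the term-frequency scoring of a full-text engine approximates the feature similarity. The token scheme and scaling factor below are our own simplifications, not VISIONE's actual encoding.

import numpy as np

def surrogate_text(feature: np.ndarray, scale: int = 10) -> str:
    tokens = []
    for i, v in enumerate(np.maximum(feature, 0)):        # keep the non-negative part
        tokens.extend([f"f{i}"] * int(round(v * scale)))  # term frequency proportional to activation
    return " ".join(tokens)

print(surrogate_text(np.array([0.31, 0.0, 0.87])))  # 'f0 f0 f0 f2 f2 ...'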

 Intelligenza Artificiale per Ricerca in Big Multimedia Data

F. Carrara, G. Amato, F. Debole, M. Di Benedetto, F. Falchi, C. Gennaro, N. Messina

In Ital-IA CINI Conference on Artificial Intelligence, March 18-19, 2019, Rome, Italy.

The widespread production of digital images and media has made automatic, large-scale analysis and indexing methods necessary for their exploitation. The AIMIR group of ISTI-CNR has specialized in this field for years and has embraced Deep Learning techniques based on artificial neural networks for many aspects of this discipline, such as the analysis, annotation, and automatic description of visual content, and its large-scale retrieval.

@InProceedings{2019Ital-IA-Multimedia,
author="Carrara, Fabio and Amato, Giuseppe and Debole, Franca and Di Benedetto, Marco and Falchi, Fabrizio and Gennaro, Claudio and Messina, Nicola",
title="Intelligenza Artificiale per Ricerca in Big Multimedia Data",
booktitle="Ital-IA",
year="2019",
publisher="CINI",
url="http://www.ital-ia.it/submission/54/paper",
}

 Intelligenza Artificiale e Analisi Visuale per la Cyber Security

C. Vairo, G. Amato, L. Ciampi, F. Falchi, C. Gennaro, F.V. Massoli

In Ital-IA CINI Conference on Artificial Intelligence, March 18-19, 2019, Rome, Italy.

In recent years, Cyber Security has acquired an increasingly broad connotation, going beyond the mere security of computer systems to include surveillance and security in a wider sense, exploiting the latest technologies such as artificial intelligence. This contribution presents the main research activities and some of the technologies used and developed by the AIMIR research group of ISTI-CNR, and provides an overview of past and currently active research projects in which these artificial intelligence technologies are used to develop applications and services for Cyber Security.
@InProceedings{2019Ital-IA-CS,
author="Vairo, Claudio and Amato, Giuseppe and Ciampi, Luca and Falchi, Fabrizio and Gennaro, Claudio and Massoli, Fabio Valerio",
title="Intelligenza Artificiale e Analisi Visuale per la Cyber Security",
booktitle="Ital-IA",
year="2019",
publisher="CINI",
url="http://www.ital-ia.it/submission/36/paper",
}

 Intelligenza Artificiale, Retrieval e Beni Culturali

L. Vadicamo, G. Amato, P. Bolettieri, F. Falchi, C. Gennaro, F. Rabitti

In Ital-IA CINI Conference on Artificial Intelligence, March 18-19, 2019, Rome, Italy.

Visits to museums or to points of interest in cities of art can be completely reinvented through modern and dynamic modes of fruition, based on visual recognition and localization technologies, image search, and augmented reality visualization. For years, the AIMIR research group has carried out research on these topics, also holding positions of responsibility in national and international projects. This contribution summarizes some of the research activities carried out and the technologies used, as well as the participation in projects that have employed artificial intelligence technologies for the valorization and fruition of cultural heritage.
@InProceedings{2019Ital-IA-CH,
author="Vadicamo, Lucia and Amato, Giuseppe and Bolettieri, Paolo and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto",
title="Intelligenza Artificiale, Retrieval e Beni Culturali",
booktitle="Ital-IA",
year="2019",
publisher="CINI",
url="http://www.ital-ia.it/submission/33/paper",
}

  Face Verification and Recognition for Digital Forensics and Information Security

G. Amato, F. Falchi, C. Gennaro, F.V. Massoli, N. Passalis, A. Tefas, A. Trivilini, C. Vairo

7th International Symposium on Digital Forensics and Security (ISDFS 2019)
June 10-12, 2019 - Barcelos, Portugal.

ISBN: 978-1-7281-2828-3, eISBN: 978-1-7281-2827-6
WOS: 000490864900020, Scopus: 2-s2.0-85070520041, DOI: 10.1109/ISDFS.2019.8757511

In this paper, we present an extensive evaluation of face recognition and verification approaches performed by the European COST Action MULTI-modal Imaging of FOREnsic SciEnce Evidence (MULTI-FORESEE). The aim of the study is to evaluate various face recognition and verification methods, ranging from methods based on facial landmarks to state-of-the-art off-the-shelf pre-trained Convolutional Neural Networks (CNN), as well as CNN models directly trained for the task at hand. To fulfill this objective, we carefully designed and implemented a realistic data acquisition process, corresponding to a typical face verification setup, and collected a challenging dataset to evaluate the real-world performance of the aforementioned methods. Apart from verifying the effectiveness of deep learning approaches in a specific scenario, several important limitations are identified and discussed throughout the paper, providing valuable insights for future research directions in the field.
@INPROCEEDINGS{8757511,
author={G. {Amato} and F. {Falchi} and C. {Gennaro} and F. V. {Massoli} and N. {Passalis} and A. {Tefas} and A. {Trivilini} and C. {Vairo}},
booktitle={2019 7th International Symposium on Digital Forensics and Security (ISDFS)},
title={Face Verification and Recognition for Digital Forensics and Information Security},
year={2019},
volume={},
number={},
pages={1-6},
keywords={convolutional neural nets;data acquisition;digital forensics;face recognition;learning (artificial intelligence);digital forensics;information security;face recognition;European COST Action MULTImodal Imaging;FOREnsic SciEnce Evidence;MULTIFORESEE;verification methods;facial landmarks;CNN models;deep learning approaches;data acquisition process;face verification setup;convolutional neural networks;CNN;Forensics;Face Verification;Deep Learning;Surveillance;Security},
doi={10.1109/ISDFS.2019.8757511},
ISSN={},
month={June},}

  Counting Vehicles with Deep Learning in Onboard UAV Imagery

G. Amato, L. Ciampi, F. Falchi, C. Gennaro

IEEE Symposium on Computers and Communications proceedings (ISCC 2019)
June 30 - July 3, 2019 - Barcelona, Spain.

@INPROCEEDINGS{Amato2019ISCC,
author={G. {Amato} and L. {Ciampi} and F. {Falchi} and C. {Gennaro}},
booktitle={2019 IEEE Symposium on Computers and Communications (ISCC)},
title={Counting Vehicles with Deep Learning in Onboard UAV Imagery},
year={2019},
note={to appear}
}

Learning Relationship-aware Visual Features

N. Messina, G. Amato, F. Carrara, F. Falchi, C. Gennaro

ECCV 2018 The European Conference on Computer Vision.
CEFRL 2018, 2nd Int. Workshop on Compact and Efficient Feature Representation and Learning in Computer Vision
Munich, Germany, 9 September 2018.

ISBN: 978-3-030-11017-8, eISBN: 978-3-030-11018-5, DOI: 10.1007/978-3-030-11018-5_40

Relational reasoning in Computer Vision has recently shown impressive results on visual question answering tasks. On the challenging CLEVR dataset, the recently proposed Relation Network (RN), a simple plug-and-play module and one of the state-of-the-art approaches, has obtained very good accuracy (95.5%) in answering relational questions. In this paper, we define a sub-field of Content-Based Image Retrieval (CBIR) called Relational-CBIR (R-CBIR), in which we are interested in retrieving images with given relationships among objects. To this aim, we employ the RN architecture to extract relation-aware features from CLEVR images. To prove the effectiveness of these features, we extended both the CLEVR and Sort-of-CLEVR datasets, generating a ground truth for R-CBIR by exploiting relational data embedded into scene graphs. Furthermore, we propose a modification of the RN module, a two-stage Relation Network (2S-RN), that enabled us to extract relation-aware features by using a preprocessing stage able to focus on the image content, leaving the question apart. Experiments show that our RN features, especially the 2S-RN ones, outperform the state-of-the-art R-MAC features on this new challenging task.
@InProceedings{10.1007/978-3-030-11018-5_40,
author="Messina, Nicola
and Amato, Giuseppe
and Carrara, Fabio
and Falchi, Fabrizio
and Gennaro, Claudio",
editor="Leal-Taix{\'e}, Laura
and Roth, Stefan",
title="Learning Relationship-Aware Visual Features",
booktitle="Computer Vision -- ECCV 2018 Workshops",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="486--501"
}
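
A minimal, simplified sketch of the relational stage (our own reduction of the Relation Network; dimensions are placeholders): every ordered pair of object features goes through a shared MLP g, and the sum of the outputs is a relation-aware representation. In the two-stage variant (2S-RN), this first stage is question-independent, so its output can serve as an R-CBIR feature.

import torch
import torch.nn as nn

class RelationStage(nn.Module):
    def __init__(self, obj_dim=32, hid=64):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * obj_dim, hid), nn.ReLU(), nn.Linear(hid, hid))

    def forward(self, objects):               # objects: (batch, n_obj, obj_dim)
        b, n, d = objects.shape
        oi = objects.unsqueeze(2).expand(b, n, n, d)
        oj = objects.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([oi, oj], dim=-1)   # all ordered pairs (o_i, o_j)
        return self.g(pairs).sum(dim=(1, 2))  # aggregated relation-aware feature

feat = RelationStage()(torch.randn(4, 6, 32))  # (4, 64) features usable for retrieval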

Adversarial examples detection in features distance spaces

F. Carrara, R. Becarelli, R. Caldelli, F. Falchi, G. Amato

ECCV 2018 The European Conference on Computer Vision.
WOCM 2018, Int. Workshop on Objectionable Content and Misinformation in Computer Vision.
Munich, Germany, 8 September 2018.

ISBN: 978-3-030-11011-6, eISBN: 978-3-030-11012-3, Scopus: 2-s2.0-85061772510, DOI: 10.1007/978-3-030-11012-3_26

Maliciously manipulated inputs for attacking machine learning methods – in particular deep neural networks – are emerging as a relevant issue for the security of recent artificial intelligence technologies, especially in computer vision. In this paper, we focus on attacks targeting image classifiers implemented with deep neural networks, and we propose a method for detecting adversarial images which focuses on the trajectory of internal representations (i.e., hidden-layer neuron activations, also known as deep features) from the very first layer up to the last. We argue that the representations of adversarial inputs follow a different evolution with respect to genuine inputs, and we define a distance-based embedding of features to efficiently encode this information. We train an LSTM network that analyzes the sequence of deep features embedded in a distance space to detect adversarial examples. The results of our preliminary experiments are encouraging: our detection scheme is able to detect adversarial inputs targeted to the ResNet-50 classifier pre-trained on the ILSVRC'12 dataset and generated by a variety of crafting algorithms.
@InProceedings{10.1007/978-3-030-11012-3_26,
author="Carrara, Fabio
and Becarelli, Rudy
and Caldelli, Roberto
and Falchi, Fabrizio
and Amato, Giuseppe",
editor="Leal-Taix{\'e}, Laura
and Roth, Stefan",
title="Adversarial Examples Detection in Features Distance Spaces",
booktitle="Computer Vision -- ECCV 2018 Workshops",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="313--327",
abstract="Maliciously manipulated inputs for attacking machine learning methods -- in particular deep neural networks -- are emerging as a relevant issue for the security of recent artificial intelligence technologies, especially in computer vision. In this paper, we focus on attacks targeting image classifiers implemented with deep neural networks, and we propose a method for detecting adversarial images which focuses on the trajectory of internal representations (i.e. hidden layers neurons activation, also known as deep features) from the very first, up to the last. We argue that the representations of adversarial inputs follow a different evolution with respect to genuine inputs, and we define a distance-based embedding of features to efficiently encode this information. We train an LSTM network that analyzes the sequence of deep features embedded in a distance space to detect adversarial examples. The results of our preliminary experiments are encouraging: our detection scheme is able to detect adversarial inputs targeted to the ResNet-50 classifier pre-trained on the ILSVRC'12 dataset and generated by a variety of crafting algorithms.",
isbn="978-3-030-11012-3"
}
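
Under our own assumptions (layer count, feature size, and centroid-based embedding are placeholders), the detection pipeline can be sketched as follows: the activations of each layer are embedded as distances to per-class centroids computed on genuine data, and the resulting sequence, one vector per layer, is classified by an LSTM as genuine vs. adversarial.

import torch
import torch.nn as nn

n_layers, n_classes = 12, 10
centroids = torch.randn(n_layers, n_classes, 256)  # assumed precomputed on genuine data

def distance_sequence(acts: torch.Tensor) -> torch.Tensor:
    """acts: (n_layers, 256) activations -> (n_layers, n_classes) distance embedding."""
    return torch.cdist(acts.unsqueeze(1), centroids).squeeze(1)

lstm = nn.LSTM(input_size=n_classes, hidden_size=64, batch_first=True)
head = nn.Linear(64, 2)  # genuine / adversarial

seq = distance_sequence(torch.randn(n_layers, 256)).unsqueeze(0)  # (1, L, C)
out, _ = lstm(seq)
logits = head(out[:, -1])  # decision from the last step of the trajectory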

Adversarial image detection in deep neural networks

F. Carrara, F. Falchi, R. Caldelli, G. Amato, R. Becarelli

Multimedia Tools and Applications (MTAP).
Vol. 78, Issue 3, 2019, pp. 2815-2835. First Online: 21 March 2018.

ISSN: 1380-7501, eISSN: 1573-7721
WOS: 000458171600010, Scopus: 2-s2.0-85044183184, DOI: 10.1007/s11042-018-5853-4

Deep neural networks are increasingly pervading many computer vision applications, and image classification in particular. Notwithstanding that, recent works have demonstrated that it is quite easy to create adversarial examples, i.e., images malevolently modified to cause deep neural networks to fail. Such images contain changes unnoticeable to the human eye but sufficient to mislead the network. This represents a serious threat for machine learning methods. In this paper, we investigate the robustness of the representations learned by the fooled neural network, analyzing the activations of its hidden layers. Specifically, we tested scoring approaches used for kNN classification in order to distinguish between correctly classified authentic images and adversarial examples. These scores are obtained searching only among the very same images used for training the network. The results show that hidden-layer activations can be used to reveal incorrect classifications caused by adversarial attacks.
@Article{Carrara2018,
 author="Carrara, Fabio and Falchi, Fabrizio and Caldelli, Roberto and Amato, Giuseppe and Becarelli, Rudy",
 title="Adversarial image detection in deep neural networks",
 journal="Multimedia Tools and Applications",
 year="2018",
 month="Mar",
 day="21",
 doi="10.1007/s11042-018-5853-4"
}
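
A rough sketch of the kNN scoring idea (data loading and feature extraction are assumed; the file names are hypothetical): the hidden-layer activations of the network's own training images are searched, and an input is suspicious when its neighborhood disagrees with the network's prediction.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

train_feats = np.load("train_activations.npy")  # (N, D) hidden-layer activations, assumed precomputed
train_labels = np.load("train_labels.npy")      # (N,) integer labels 0..C-1

knn = KNeighborsClassifier(n_neighbors=10).fit(train_feats, train_labels)

def knn_agreement(feat: np.ndarray, predicted_class: int) -> float:
    """Fraction of the k neighbors that share the network's predicted class."""
    proba = knn.predict_proba(feat.reshape(1, -1))[0]
    return float(proba[predicted_class])  # low agreement suggests an adversarial input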

Picture it in your mind: Generating high level visual representations from textual descriptions

F. Carrara, A. Esuli, T. Fagni, F. Falchi, A.M. Fernandez

Information Retrieval Journal special issue on Neural Information Retrieval.
June 2018, Volume 21, Issue 2–3, pp 208–229

ISSN: 1573-7659, Scopus: 2-s2.0-85031427867, DOI: 10.1007/s10791-017-9318-6

In this paper we tackle the problem of image search when the query is a short textual description of the image the user is looking for. We choose to implement the actual search process as a similarity search in a visual feature space, by learning to translate a textual query into a visual representation. Searching in the visual feature space has the advantage that any update to the translation model does not require to reprocess the (typically huge) image collection on which the search is performed. We propose various neural network models of increasing complexity that learn to generate, from a short descriptive text, a high-level visual representation in a visual feature space such as the pool5 layer of the ResNet-152 or the fc6–fc7 layers of an AlexNet trained on the ILSVRC12 and Places databases. The TEXT2VIS models we explore include (1) a relatively simple regressor network relying on a bag-of-words representation of the textual descriptors, (2) a deep recurrent network that is sensitive to word order, and (3) a wide-and-deep model that combines a stacked LSTM deep network with a wide regressor network. We compare the proposed models with other search strategies, also including textual search methods that exploit state-of-the-art caption generation models to index the image collection.
@Article{Carrara2018,
author="Carrara, Fabio
and Esuli, Andrea
and Fagni, Tiziano
and Falchi, Fabrizio
and Moreo Fern{\'a}ndez, Alejandro",
title="Picture it in your mind: generating high level visual representations from textual descriptions",
journal="Information Retrieval Journal",
year="2018",
month="Jun",
day="01",
volume="21",
number="2",
pages="208--229",
issn="1573-7659",
doi="10.1007/s10791-017-9318-6",
url="https://doi.org/10.1007/s10791-017-9318-6"
}
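
The simplest TEXT2VIS variant can be sketched as a regressor that maps a bag-of-words query vector into the visual feature space, where search is a cosine-similarity scan; the vocabulary size, dimensions, and architecture below are placeholders of ours, not the paper's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, visual_dim = 5000, 2048  # e.g. a pool5-like feature size

text2vis = nn.Sequential(
    nn.Linear(vocab, 1024), nn.ReLU(),
    nn.Linear(1024, visual_dim),
)

def search(bow_query: torch.Tensor, image_feats: torch.Tensor, k: int = 5):
    """Project the text query and rank images by cosine similarity."""
    q = text2vis(bow_query)                                  # (visual_dim,)
    sims = F.cosine_similarity(q.unsqueeze(0), image_feats)  # (n_images,)
    return sims.topk(k).indices

hits = search(torch.rand(vocab), torch.randn(1000, visual_dim))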

Large-Scale Image Retrieval with Elasticsearch

G. Amato, P. Bolettieri, F. Carrara, F. Falchi, C. Gennaro

SIGIR 2018: 41st International ACM SIGIR Conference on Research & Development in Information Retrieval
Ann Arbor Michigan, U.S.A. July 8-12, 2018
ACM New York, NY, USA , 925-928

ISBN: 978-1-4503-5657-2, Scopus: 2-s2.0-85051465811, DOI: 10.1145/3209978.3210089

Content-Based Image Retrieval in large archives through the use of visual features has become a very attractive research topic in recent years. This strong impulse is largely attributable to the use of Convolutional Neural Network (CNN) activations as features and to their outstanding performance. However, practically all the available image retrieval systems are implemented in main memory, limiting their applicability and preventing their usage in big-data applications. In this paper, we propose to transform CNN features into textual representations and to index them with the well-known full-text retrieval engine Elasticsearch. We validate our approach on a novel CNN feature, namely Regional Maximum Activations of Convolutions. A preliminary experimental evaluation, conducted on the standard INRIA Holidays benchmark, shows the effectiveness and efficiency of the proposed approach and how it compares to state-of-the-art main-memory indexes.
@inproceedings{Amato:2018:LIR:3209978.3210089,
 author = {Amato, Giuseppe and Bolettieri, Paolo and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio},
 title = {Large-Scale Image Retrieval with Elasticsearch},
 booktitle = {The 41st International ACM SIGIR Conference on Research \& Development in Information Retrieval},
 series = {SIGIR '18},
 year = {2018},
 isbn = {978-1-4503-5657-2},
 location = {Ann Arbor, MI, USA},
 pages = {925--928},
 numpages = {4},
 url = {http://doi.acm.org/10.1145/3209978.3210089},
 doi = {10.1145/3209978.3210089},
 acmid = {3210089},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {elasticsearch, inverted index, product quantization, mean average precision, regional maximum activations of convolutions},
}
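
Assuming surrogate texts like the ones sketched earlier for VISIONE, indexing and searching with the official Elasticsearch Python client (8.x-style calls) reduces to ordinary document operations; the index name and field are our own choices.

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def index_image(image_id: str, surrogate: str):
    es.index(index="images", id=image_id, document={"repr": surrogate})

def visual_search(surrogate_query: str, k: int = 10):
    resp = es.search(index="images", query={"match": {"repr": surrogate_query}}, size=k)
    return [hit["_id"] for hit in resp["hits"]["hits"]]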

Distributed Video Surveillance Using Smart Cameras

H. Kavalionak, C. Gennaro, G. Amato, C. Vairo, C. Perciante, C. Meghini, F. Falchi

Journal of Grid Computing.
Vol. 17, Issue 1, 15 March 2019, pp. 59-77. First Online: 25 October 2018

ISSN: 1570-7873, eISSN: 1572-9184
WOS: 000464721600005, Scopus: 2-s2.0-85055954783, DOI: 10.1007/s10723-018-9467-x

Video surveillance systems have become an indispensable tool for the security and organization of public and private areas. Most current commercial video surveillance systems rely on a classical client/server architecture to perform face and object recognition. In order to support the more complex and advanced video surveillance systems proposed in recent years, companies have to invest resources in maintaining the servers dedicated to the recognition tasks. In this work, we propose a novel distributed protocol for a face recognition system that exploits the computational capabilities of the surveillance devices (i.e., cameras) to perform the recognition of the person. The cameras fall back to a centralized server if their hardware capabilities are not sufficient to perform the recognition. In order to evaluate the proposed algorithm, we simulate and test the 1NN and weighted kNN classification algorithms via extensive experiments on a freely available dataset. As prototype surveillance devices we considered Raspberry Pi boards. By means of simulations, we show that our algorithm is able to offload up to 50% of the load from the server with no negative impact on the quality of the surveillance service.
@Article{Kavalionak2018,
author="Kavalionak, Hanna
and Gennaro, Claudio
and Amato, Giuseppe
and Vairo, Claudio
and Perciante, Costantino
and Meghini, Carlo
and Falchi, Fabrizio",
title="Distributed Video Surveillance Using Smart Cameras",
journal="Journal of Grid Computing",
year="2018",
month="Oct",
day="25",
issn="1572-9184",
doi="10.1007/s10723-018-9467-x",
url="https://doi.org/10.1007/s10723-018-9467-x"
}

Cross-Media Learning for Image Sentiment Analysis in the Wild

L. Vadicamo, F. Carrara, A. Cimino, S. Cresci, F. Dell'Orletta, F. Falchi, M. Tesconi

5th Workshop on Web-scale Vision and Social Media (VSM), ICCV 2017
2017 IEEE International Conference on Computer Vision Workshops (ICCVW)

eISBN: 978-1-5386-1034-3, ISBN: 978-1-5386-1035-0
WOS: 000425239600038, Scopus: 2-s2.0-85046302366, DOI: 10.1109/ICCVW.2017.45

Much progress has been made in the field of sentiment analysis in the past years. Researchers have mostly relied on textual data for this task, and only recently have they started investigating approaches to predict sentiment from multimedia content. With the increasing amount of data shared on social media, there is also a rapidly growing interest in approaches that work ``in the wild'', i.e. that are able to deal with uncontrolled conditions. In this work, we faced the challenge of training a visual sentiment classifier starting from a large set of user-generated and unlabeled contents. In particular, we collected more than 3 million tweets containing both text and images, and we leveraged the sentiment polarity of the textual contents to train a visual sentiment classifier. To the best of our knowledge, this is the first time a cross-media learning approach has been proposed and tested in this context. We assessed the validity of our model by conducting comparative studies and evaluations on a benchmark for visual sentiment analysis.

@INPROCEEDINGS{8265255,
author={L. Vadicamo and F. Carrara and A. Cimino and S. Cresci and F. Dell'Orletta and F. Falchi and M. Tesconi},
booktitle={2017 IEEE International Conference on Computer Vision Workshops (ICCVW)},
title={Cross-Media Learning for Image Sentiment Analysis in the Wild},
year={2017},
volume={},
number={},
pages={308-317},
keywords={convolution;data visualisation;feedforward neural nets;image classification;learning (artificial intelligence);sentiment analysis;social networking (online);Tweets;cross-media learning;deep convolutional neural network training;image content;image sentiment analysis;multimedia content;social media;textual data;visual sentiment analysis;visual sentiment classifier;Feature extraction;Media;Sentiment analysis;Support vector machines;Twitter;Visualization},
doi={10.1109/ICCVW.2017.45},
ISSN={},
month={Oct}} 

Detecting adversarial example attacks to deep neural networks

F. Carrara, F. Falchi, R. Caldelli, G. Amato, R. Fumarola, R. Becarelli

Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing (CBMI 2017), article n. 38.

ISBN: 9781450353335, WOS: 000426964400038, Scopus: 2-s2.0-85030762303, DOI: 10.1145/3095713.3095753

Deep learning has recently become state-of-the-art in many computer vision applications and in image classification in particular. It is now a mature technology that can be used in several real-life tasks. However, it is possible to create adversarial examples, containing changes unnoticeable to humans, which cause an incorrect classification by a deep convolutional neural network. This represents a serious threat for machine learning methods. In this paper we investigate the robustness of the representations learned by the fooled neural network. Specifically, we use a kNN classifier over the activations of hidden layers of the convolutional neural network, in order to define a strategy for distinguishing between correctly classified authentic images and adversarial examples. The results show that hidden layers activations can be used to detect incorrect classifications caused by adversarial attacks.
@inproceedings{Carrara:2017:DAE:3095713.3095753,
 author = {Carrara, Fabio and Falchi, Fabrizio and Caldelli, Roberto and Amato, Giuseppe and Fumarola, Roberta and Becarelli, Rudy},
 title = {Detecting Adversarial Example Attacks to Deep Neural Networks},
 booktitle = {Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing},
 series = {CBMI '17},
 year = {2017},
 isbn = {978-1-4503-5333-5},
 location = {Florence, Italy},
 pages = {38:1--38:7},
 articleno = {38},
 numpages = {7},
 url = {http://doi.acm.org/10.1145/3095713.3095753},
 doi = {10.1145/3095713.3095753},
 acmid = {3095753},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Adversarial images detection, Deep Convolutional Neural Network, Machine Learning Security},
} 

Deep learning for decentralized parking lot occupancy detection

G. Amato, F. Carrara, F. Falchi, C. Gennaro, C. Meghini, C. Vairo

Expert Systems with Applications (ESWA)

Volume 72, 2017, Pages 327-334 - Online from 29/10/2016, Elsevier Sci LTD, Oxford, UK
ISSN: 0957-4174, eISSN: 1873-6793, WOS: 000392770900028, Scopus: 2-s2.0-85006097682, DOI: 10.1016/j.eswa.2016.10.055

A smart camera is a vision system capable of extracting application-specific information from the captured images. This paper proposes a decentralized and efficient solution for visual parking lot occupancy detection based on a deep Convolutional Neural Network (CNN) specifically designed for smart cameras. The solution is compared with state-of-the-art approaches using two visual datasets: PKLot, already existing in the literature, which allowed us to exhaustively compare with previous works, and CNRPark-EXT, created in the context of this research by accumulating data across various seasons of the year to test our approach in particularly challenging situations exhibiting occlusions and diverse, difficult viewpoints. This dataset is publicly available to the scientific community and is another contribution of our research. Our experiments show that our solution outperforms and generalizes the best-performing approaches on both datasets. The performance of our proposed CNN architecture on the parking lot occupancy detection task is comparable to that of the well-known AlexNet, which is three orders of magnitude larger.
@article{AMATO2017327,
title = "Deep learning for decentralized parking lot occupancy detection",
journal = "Expert Systems with Applications",
volume = "72",
number = "",
pages = "327 - 334",
year = "2017",
note = "",
issn = "0957-4174",
doi = "http://dx.doi.org/10.1016/j.eswa.2016.10.055",
url = "http://www.sciencedirect.com/science/article/pii/S095741741630598X",
author = "Giuseppe Amato and Fabio Carrara and Fabrizio Falchi and Claudio Gennaro and Carlo Meghini and Claudio Vairo",
keywords = "Machine learning",
keywords = "Classification",
keywords = "Deep learning",
keywords = "Convolutional neural networks",
keywords = "Parking space dataset"
}
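
The spirit of the compact classifier can be conveyed with a toy PyTorch model for binary classification of parking-space patches; the actual mAlexNet architecture proposed in the paper differs in its exact layer configuration, so the following is only an illustrative sketch.

import torch
import torch.nn as nn

class MiniParkNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=11, stride=4), nn.ReLU(), nn.MaxPool2d(3, 2),
            nn.Conv2d(16, 20, kernel_size=5), nn.ReLU(), nn.MaxPool2d(3, 2),
            nn.Conv2d(20, 30, kernel_size=3), nn.ReLU(), nn.MaxPool2d(3, 2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(48), nn.ReLU(), nn.Linear(48, 2),
        )

    def forward(self, x):  # x: (batch, 3, 224, 224) crop of a single parking space
        return self.classifier(self.features(x))

logits = MiniParkNet()(torch.randn(1, 3, 224, 224))  # free vs. busy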

YFCC100M-HNFC6: A large-scale deep features benchmark for similarity search

G. Amato, F. Falchi, C. Gennaro, F. Rabitti

Similarity Search and Applications: 9th International Conference, SISAP 2016, Tokyo, Japan, October 24-26, 2016

Lecture Notes in Computer Science, vol. 9939, Pages 196-209, Springer International Publishing (Cham, Switzerland), 2016
ISSN: 0302-9743, ISBN: 978-3-319-46759-7, WOS: 000389801100015, Scopus: 2-s2.0-84989904386, DOI: 10.1007/978-3-319-46759-7_15

In this paper, we present YFCC100M-HNfc6, a benchmark consisting of 97M deep features extracted from the Yahoo Creative Commons 100M (YFCC100M) dataset. Three types of features were extracted using a state-of-the-art Convolutional Neural Network trained on the ImageNet and Places datasets. Together with the features, we made publicly available a set of 1,000 queries and the k-NN results obtained by sequential scan. We first report detailed statistical information on both the features and the search results. Then, we show an example of performance evaluation, performed using this benchmark, on the MI-File approximate similarity access method.
@Inbook{Amato2016,
 author="Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio
 and Rabitti, Fausto",
 title="YFCC100M-HNfc6: A Large-Scale Deep Features Benchmark for Similarity Search",
 bookTitle="Similarity Search and Applications: 9th International Conference, SISAP 2016, Tokyo, Japan, October 24-26, 2016, Proceedings",
 year="2016",
 publisher="Springer International Publishing",
 address="Cham",
 pages="196--209",
 isbn="978-3-319-46759-7",
 doi="10.1007/978-3-319-46759-7_15",
 url="https://doi.org/10.1007/978-3-319-46759-7_15"
}

YFCC100M HybridNet fc6 Deep Features for Content-Based Image Retrieval

G. Amato, F. Falchi, C. Gennaro, F. Rabitti

Proceedings of the 2016 ACM Workshop on Multimedia COMMONS, MMCommons '16

ACM New York, NY, USA. ISBN: 978-1-4503-4515-6, Scopus: 2-s2.0-84995553525, DOI: 10.1145/2983554.2983557

This paper presents a corpus of deep features extracted from the YFCC100M images, considering the fc6 hidden layer activation of the HybridNet deep convolutional neural network. For a set of randomly selected queries, we made available the k-NN results obtained by sequentially scanning the entire set of features, using both the Euclidean distance and the Hamming distance on a binarized version of the features. This set of results serves as ground truth for evaluating Content-Based Image Retrieval (CBIR) systems that use approximate similarity search methods for efficient and scalable indexing. Moreover, we present experimental results obtained by indexing this corpus with two distinct approaches: the Metric Inverted File and Lucene Quantization. These two CBIR systems are publicly available online, allowing real-time search using both internal and external queries.
@inproceedings{Amato:2016:YHF:2983554.2983557,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto},
 title = {YFCC100M HybridNet Fc6 Deep Features for Content-Based Image Retrieval},
 booktitle = {Proceedings of the 2016 ACM Workshop on Multimedia COMMONS},
 series = {MMCommons '16},
 year = {2016},
 isbn = {978-1-4503-4515-6},
 location = {Amsterdam, The Netherlands},
 pages = {11--18},
 numpages = {8},
 url = {http://doi.acm.org/10.1145/2983554.2983557},
 doi = {10.1145/2983554.2983557},
 acmid = {2983557},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {YFCC100M, content-based image retrieval, deep features, multimedia information retrieval},
}
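
The binarization and Hamming comparison used to produce the ground truth can be sketched with NumPy bit operations; thresholding at zero is our assumption for the binarization step.

import numpy as np

def binarize(feats: np.ndarray) -> np.ndarray:
    return np.packbits(feats > 0, axis=1)          # (N, D/8) uint8 codes

def hamming_distances(query: np.ndarray, codes: np.ndarray) -> np.ndarray:
    xor = np.bitwise_xor(codes, query)             # differing bits, per byte
    return np.unpackbits(xor, axis=1).sum(axis=1)  # popcount per database item

codes = binarize(np.random.randn(1000, 4096).astype(np.float32))
d = hamming_distances(codes[0], codes)             # distances of item 0 to all items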


Aggregating binary local descriptors for image retrieval

G. Amato, F. Falchi, L. Vadicamo

Multimedia Tools and Applications (MTAP) March 2018, Volume 77, Issue 5, pp 5385–5415

Springer US, 2018. ISSN: 1380-7501, eISSN: 1573-7721
WOS: 000427132800013 | Scopus: 2-s2.0-85014088854 | DOI: 10.1007/s11042-017-4450-2

Content-Based Image Retrieval based on local features is computationally expensive because of the complexity of both the extraction and the matching of local features. On one hand, the cost of extracting, representing, and comparing local visual descriptors has been dramatically reduced by recently proposed binary local features. On the other hand, aggregation techniques provide a meaningful summarization of all the extracted features of an image into a single descriptor, allowing us to speed up and scale up the image search. Only a few works have recently combined these two research directions, defining aggregation methods for binary local features, in order to leverage the advantages of both approaches. In this paper, we report an extensive comparison among state-of-the-art aggregation methods applied to binary features. We then mathematically formalize the application of Fisher Kernels to Bernoulli Mixture Models. Finally, we investigate the combination of the aggregated binary features with the emerging Convolutional Neural Network (CNN) features. Our results show that aggregation methods on binary features are effective and represent a worthwhile alternative to direct matching. Moreover, the combination of the CNN with the Fisher Vector (FV) built upon binary features allowed us to obtain a relative improvement over the CNN results that is in line with that recently obtained using the combination of the CNN with the FV built upon SIFTs. The advantage of using the FV built upon binary features is that the extraction of binary features is about two orders of magnitude faster than that of SIFTs.
@Article{Amato2017,
author="Amato, Giuseppe and Falchi, Fabrizio and Vadicamo, Lucia",
title="Aggregating binary local descriptors for image retrieval",
journal="Multimedia Tools and Applications",
year="2017",
month="Mar",
day="02",
issn="1573-7721",
doi="10.1007/s11042-017-4450-2",
url="https://doi.org/10.1007/s11042-017-4450-2"
}

A comparison of pivot selection techniques for permutation-based indexing

G. Amato, A. Esuli, F. Falchi

Information Systems, Volume 52, August–September 2015, Pages 176–188

ISSN: 0306-4379, WOS: 000356983400012 | Scopus: 2-s2.0-84930083486 | DOI: 10.1016/j.is.2015.01.010

Recently, permutation based indexes have attracted interest in the area of similarity search. The basic idea of permutation based indexes is that data objects are represented as appropriately generated permutations of a set of pivots (or reference objects). Similarity queries are executed by searching for data objects whose permutation representation is similar to that of the query, following the assumption that similar objects are represented by similar permutations of the pivots. In the context of permutation-based indexing, most authors propose to select pivots randomly from the data set, given that traditional pivot selection techniques do not reveal better performance. However, to the best of our knowledge, no rigorous comparison has been performed yet. In this paper we compare five pivot selection techniques on three permutation-based similarity access methods. Among those, we propose a novel technique specifically designed for permutations. Two significant observations emerge from our tests. First, random selection is always outperformed by at least one of the tested techniques. Second, there is no technique that is universally the best for all permutation-based access methods; rather different techniques are optimal for different methods. This indicates that the pivot selection technique should be considered as an integrating and relevant part of any permutation-based access method.
@article{AMATO2015176,
 title = "A comparison of pivot selection techniques for permutation-based indexing",
 journal = "Information Systems",
 volume = "52",
 pages = "176 - 188",
 year = "2015",
 note = "Special Issue on Selected papers from SISAP 2013",
 issn = "0306-4379",
 doi = "http://dx.doi.org/10.1016/j.is.2015.01.010",
 author = "Giuseppe Amato and Andrea Esuli and Fabrizio Falchi",
}
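
In this line of work, permutation representations are typically compared with the Spearman Footrule distance between the pivot positions in the two rankings; a small sketch follows (conventions are ours).

import numpy as np

def spearman_footrule(perm_a: np.ndarray, perm_b: np.ndarray) -> int:
    """perm_a, perm_b: pivot IDs ordered from closest to farthest."""
    pos_a = np.argsort(perm_a)  # position of each pivot in ranking a
    pos_b = np.argsort(perm_b)
    return int(np.abs(pos_a - pos_b).sum())

print(spearman_footrule(np.array([2, 0, 1]), np.array([0, 2, 1])))  # 2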

Efficient Indexing of Regional Maximum Activations of Convolutions using Full-Text Search Engines

G. Amato, F. Carrara, F. Falchi, C. Gennaro

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval (ICMR '17)

Bucharest, Romania, June 06-09, 2017. ACM New York, NY, USA, 2017, Pages 420-423. ISBN: 978-1-4503-4701-3, Scopus: 2-s2.0-85021827776, DOI: 10.1145/3078971.3079035

In this paper, we adapt a surrogate text representation technique to develop efficient instance-level image retrieval using Regional Maximum Activations of Convolutions (R-MAC). R-MAC features have recently shown outstanding performance in visual instance retrieval. However, contrary to the activations of hidden layers adopting ReLU (Rectified Linear Unit), these features are dense. This constitutes an obstacle to the direct use of inverted indexes, which rely on data sparsity. We propose the use of deep permutations, a recent approach for efficient evaluation of permutations, to generate surrogate text representations of R-MAC features, enabling the indexing of visual features as text into a standard search engine. The experiments, conducted on Lucene, show the effectiveness and efficiency of the proposed approach.
@inproceedings{Amato:2017:EIR:3078971.3079035,
author = {Amato, Giuseppe and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio},
title = {Efficient Indexing of Regional Maximum Activations of Convolutions Using Full-Text Search Engines},
booktitle = {Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval},
series = {ICMR '17},
year = {2017},
isbn = {978-1-4503-4701-3},
location = {Bucharest, Romania},
pages = {420--423},
numpages = {4},
url = {http://doi.acm.org/10.1145/3078971.3079035},
doi = {10.1145/3078971.3079035},
acmid = {3079035},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {deep convolutional neural network, permutation-based indexing, similarity search},
}

Technologies for visual localization and augmented reality in smart cities

G. Amato, F.A. Cardillo, F. Falchi

Sensing the Past: From artifact to historical site

Geotechnologies and the Environment, Volume 16, 2017
Springer International Publishing, Cham (Switzerland), 2017, Pages 419-434
ISBN: 978-3-319-50518-3, ISSN: 2365-0575
WOS: 000419729100021 | DOI: 10.1007/978-3-319-50518-3_20

The widespread diffusion of smart devices, such as smartphones and tablets, and the new emerging trend of wearable devices, such as smart glasses and smart watches, has pushed forward the development of applications where the user can interact relying on his or her position and field of view. In this way, users can also receive additional information in augmented reality, that is, seeing the information through the smart device, overlaid on top of the real scene. The GPS or the compass can be used to localize the user when augmented reality has to be provided with scenes of large size, for instance, squares or large buildings. However, when augmented reality has to be offered for enriching the view of small objects or small details of larger objects, for instance, statues, paintings, or epigraphs, a more precise positioning is needed. Visual object recognition and tracking technologies offer very detailed and fine-grained positioning capabilities. This chapter discusses the techniques enabling a precise positioning of the user and the subsequent experience in augmented reality, focusing on algorithms for image matching and homography estimation between the images seen by smart devices and images representing objects of interest.

@Inbook{Amato2017,
author="Amato, Giuseppe and Cardillo, Franco Alberto and Falchi, Fabrizio",
editor="Masini, Nicola and Soldovieri, Francesco",
title="Technologies for Visual Localization and Augmented Reality in Smart Cities",
bookTitle="Sensing the Past: From artifact to historical site",
year="2017",
publisher="Springer International Publishing",
address="Cham",
pages="419--434",
isbn="978-3-319-50518-3",
doi="10.1007/978-3-319-50518-3_20",
url="https://doi.org/10.1007/978-3-319-50518-3_20"
}
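
The matching-plus-homography pipeline discussed in the chapter can be sketched with OpenCV (the file names are placeholders; the chapter surveys the techniques rather than prescribing this exact code): local features are matched between the device view and a reference image of the object, and a homography is estimated with RANSAC.

import cv2
import numpy as np

scene = cv2.imread("device_view.jpg", cv2.IMREAD_GRAYSCALE)
ref = cv2.imread("artifact.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(ref, None)
kp2, des2 = orb.detectAndCompute(scene, None)

matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# H maps reference coordinates into the scene: the anchor for augmented-reality overlays
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)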

Visual recognition of ancient inscriptions using convolutional neural network and fisher vector

G. Amato, F. Falchi, L. Vadicamo

Journal on Computing and Cultural Heritage (JOCCH)

Volume 9, Issue 4, December 2016, Article number 21
ACM New York, NY, USA
ISSN: 1556-4673, eISSN: 1556-4711
WOS: 000391726300004 | Scopus: 2-s2.0-85006974335 | DOI: 10.1145/2964911

By bringing together the most prominent European institutions and archives in the field of Classical Latin and Greek epigraphy, the EAGLE project has collected the vast majority of the surviving Greco-Latin inscriptions into a single readily-searchable database. Text-based search engines are typically used to retrieve information about ancient inscriptions (or about other artifacts). These systems require the users to formulate a text query containing information such as the place where the object was found or where it is currently located. Conversely, visual search systems can be used to provide information to users (such as tourists and scholars) in a most intuitive and immediate way, just using an image as the query. In this article, we provide a comparison of several approaches for visually recognizing ancient inscriptions. Our experiments, conducted on 17,155 photos related to 14,560 inscriptions, show that BoW and VLAD are outperformed by both Fisher Vector (FV) and Convolutional Neural Network (CNN) features. More interestingly, combining FV and CNN features into a single image representation achieves very high effectiveness, correctly recognizing the query inscription in more than 90% of the cases. Our results suggest that combinations of FV and CNN can also be exploited to effectively perform visual retrieval of other types of objects related to cultural heritage, such as landmarks and monuments.
@article{Amato:2016:VRA:2999570.2964911,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Vadicamo, Lucia},
 title = {Visual Recognition of Ancient Inscriptions Using Convolutional Neural Network and Fisher Vector},
 journal = {J. Comput. Cult. Herit.},
 issue_date = {December 2016},
 volume = {9},
 number = {4},
 month = dec,
 year = {2016},
 issn = {1556-4673},
 pages = {21:1--21:24},
 articleno = {21},
 numpages = {24},
 url = {http://doi.acm.org/10.1145/2964911},
 doi = {10.1145/2964911},
 acmid = {2964911},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Fisher vector, Latin and Greek inscriptions, convolutional neural network, epigraphy},
} 

Deep Permutations: Deep Convolutional Neural Networks and Permutation-Based Indexing

G. Amato, F. Falchi, C. Gennaro, L. Vadicamo

Similarity Search and Applications: 9th International Conference, SISAP 2016

Location Tokyo, Japan, October 24-26, 2016
Lecture Notes in Computer Science, vol. 9939, Pages 93-106
Springer International Publishing Switzerland (Cham, Switzerland), 2016
ISSN: 0302-9743, ISBN: 978-3-319-46759-7
WOS: 000389801100007 | Scopus: 2-s2.0-84989842652 | DOI: 10.1007/978-3-319-46759-7_7

The activations of the hidden layers of Deep Convolutional Neural Networks can be successfully used as features, often referred to as Deep Features, in generic visual similarity search tasks. Recently, scientists have shown that permutation-based methods offer very good performance in indexing and supporting approximate similarity search on large databases of objects. Permutation-based approaches represent metric objects as sequences (permutations) of reference objects, chosen from a predefined set of data. However, associating objects with permutations might have a high cost due to the distance calculations between the data objects and the reference objects. In this work, we propose a new approach to generate permutations at a very low computational cost when the objects to be indexed are Deep Features. We show that the permutations generated using the proposed method are more effective than those obtained using pivot selection criteria specifically developed for permutation-based methods.
@Inbook{Amato2016,
author="Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio
and Vadicamo, Lucia",
title="Deep Permutations: Deep Convolutional Neural Networks and Permutation-Based Indexing",
bookTitle="Similarity Search and Applications: 9th International Conference, SISAP 2016, Tokyo, Japan, October 24-26, 2016, Proceedings",
year="2016",
publisher="Springer International Publishing",
address="Cham",
pages="93--106",
isbn="978-3-319-46759-7",
doi="10.1007/978-3-319-46759-7_7",
url="https://doi.org/10.1007/978-3-319-46759-7_7"
}
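
The core trick described in the abstract admits a one-line sketch: when the objects are deep features, a permutation can be obtained by ranking the feature dimensions by their activation values, with no distance computations against reference objects (the truncation length is our placeholder).

import numpy as np

def deep_permutation(feature: np.ndarray, prefix: int = 100) -> np.ndarray:
    """Indices of the `prefix` most activated components, in rank order."""
    return np.argsort(-feature)[:prefix]

perm = deep_permutation(np.random.rand(4096))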

Large scale indexing and searching deep convolutional neural network features

G. Amato, F. Debole, F. Falchi, C. Gennaro, F. Rabitti

Big Data Analytics and Knowledge Discovery

18th International Conference, DaWaK 2016
Porto, Portugal, September 6-8, 2016
ISSN: 0302-9743, eISSN: 1611-3349, ISBN: 978-3-319-43945-7
WOS: 000389020800014 | Scopus: 2-s2.0-84981249591 | DOI: 10.1007/978-3-319-43946-4_14

Content-based image retrieval using Deep Learning has become very popular during the last few years. In this work, we propose an approach to index Deep Convolutional Neural Network features to support efficient retrieval on very large image databases. The idea is to provide a text encoding for these features, enabling the use of a text retrieval engine to perform image similarity search. In this way, we built LuQ, a robust retrieval system that combines full-text search with content-based image retrieval capabilities. In order to optimize the index occupation and the query response time, we evaluated various tuning parameters used to generate the text encoding. To this end, we developed a web-based prototype to efficiently search through a dataset of 100 million images.
@Inbook{Amato2016,
author="Amato, Giuseppe and Debole, Franca and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto",
editor="Madria, Sanjay
and Hara, Takahiro",
title="Large Scale Indexing and Searching Deep Convolutional Neural Network Features",
bookTitle="Big Data Analytics and Knowledge Discovery: 18th International Conference, DaWaK 2016, Porto, Portugal, September 6-8, 2016, Proceedings",
year="2016",
publisher="Springer International Publishing",
address="Cham",
pages="213--224",
isbn="978-3-319-43946-4",
doi="10.1007/978-3-319-43946-4_14",
url="https://doi.org/10.1007/978-3-319-43946-4_14"
}

Car parking occupancy detection using smart camera networks and deep learning

G. Amato, F. Carrara, F. Falchi, C. Gennaro, C. Vairo

2016 IEEE Symposium on Computers and Communication (IEEE ISCC 2016)

27-30 June, 2016, Messina, Italy.
Received the Best Italian Paper Award
Institute of Electrical and Electronics Engineers (IEEE) Inc., 2016, pages 1212-1217
ISSN: 1530-1346, eISBN: 978-1-5090-0679-3, ISBN: 978-1-5090-0680-9
WOS: 000386979000198 | Scopus: 2-s2.0-84985914810 | DOI: 10.1109/ISCC.2016.7543901

This paper presents an approach for real-time car parking occupancy detection that uses a Convolutional Neural Network (CNN) classifier running on board of a smart camera with limited resources. Experiments show that our technique is very effective and robust to light condition changes, presence of shadows, and partial occlusions. The detection is reliable even when tests are performed using images captured from a viewpoint different from the viewpoint used for training. In addition, it also demonstrates its robustness when training and tests are executed on different parking lots. We tested and compared our solution against state-of-the-art techniques, using a reference benchmark for parking occupancy detection. We also produced and made publicly available an additional dataset that contains images of the parking lot taken from different viewpoints and on different days with different light conditions. The dataset captures occlusions and shadows that might disturb the classification of the parking space status.
@INPROCEEDINGS{7543901,
author={G. Amato and F. Carrara and F. Falchi and C. Gennaro and C. Vairo},
booktitle={2016 IEEE Symposium on Computers and Communication (ISCC)},
title={Car parking occupancy detection using smart camera networks and Deep Learning},
year={2016},
pages={1212-1217},
doi={10.1109/ISCC.2016.7543901},
isbn = {978-1-5090-0679-3},
publisher = {{IEEE} Computer Society}
}

Picture it in your mind: Generating high level visual representations from textual descriptions

F. Carrara, A. Esuli, T. Fagni, F. Falchi, A.M. Fernández

Presented at Neu-IR: The SIGIR 2016 Workshop on Neural Information Retrieval, Pisa, July 21, 2016. arXiv:1606.07287

In this paper we tackle the problem of image search when the query is a short textual description of the image the user is looking for. We choose to implement the actual search process as a similarity search in a visual feature space, by learning to translate a textual query into a visual representation. Searching in the visual feature space has the advantage that any update to the translation model does not require to reprocess the (typically huge) image collection on which the search is performed. We propose Text2Vis, a neural network that generates a visual representation, in the visual feature space of the fc6-fc7 layers of ImageNet, from a short descriptive text. Text2Vis optimizes two loss functions, using a stochastic loss-selection method. A visual-focused loss is aimed at learning the actual text-to-visual-feature mapping, while a text-focused loss is aimed at modeling the higher-level semantic concepts expressed in language and at countering the overfit on non-relevant visual components of the visual loss. We report preliminary results on the MS-COCO dataset.
@article{DBLP:journals/corr/CarraraEFFF16,
 author = {Fabio Carrara and
 Andrea Esuli and
 Tiziano Fagni and
 Fabrizio Falchi and
 Alejandro Moreo Fern{\'{a}}ndez},
 title = {Picture It In Your Mind: Generating High Level Visual Representations
 From Textual Descriptions},
 journal = {CoRR},
 volume = {abs/1606.07287},
 year = {2016},
 url = {http://arxiv.org/abs/1606.07287},
 timestamp = {Wed, 07 Jun 2017 14:43:03 +0200},
 biburl = {http://dblp.uni-trier.de/rec/bib/journals/corr/CarraraEFFF16},
 bibsource = {dblp computer science bibliography, http://dblp.org}
}

Fast Image Classification for Monument Recognition

G. Amato, F. Falchi, C. Gennaro

Journal on Computing and Cultural Heritage (JOCCH)

Volume 8, Issue 4, December 2015, Article number 18. ACM New York, NY, USA, ISSN: 1556-4673, eISSN: 1556-4711, WOS: 000361070300001 | Scopus: 2-s2.0-84939809955 | DOI: 10.1145/2724727

Content-based image classification is a wide research field that addresses the landmark recognition problem. Among the many classification techniques proposed, the k-nearest neighbor (kNN) is one of the most simple and widely used methods. In this article, we use kNN classification and landmark recognition techniques to address the problem of monument recognition in images. We propose two novel approaches that exploit kNN classification technique in conjunction with local visual descriptors. The first approach is based on a relaxed definition of the local feature based image to image similarity and allows standard kNN classification to be efficiently executed with the support of access methods for similarity search. The second approach uses kNN classification to classify local features rather than images. An image is classified evaluating the consensus among the classification of its local features. In this case, access methods for similarity search can be used to make the classification approach efficient. The proposed strategies were extensively tested and compared against other state-of-the-art alternatives in a monument and cultural heritage landmark recognition setting. The results proved the superiority of our approaches. An additional relevant contribution of this work is the exhaustive comparison of various types of local features and image matching solutions for recognition of monuments and cultural heritage related landmarks.
@article{Amato:2015:FIC:2815168.2724727,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio},
 title = {Fast Image Classification for Monument Recognition},
 journal = {J. Comput. Cult. Herit.},
 issue_date = {August 2015},
 volume = {8},
 number = {4},
 month = aug,
 year = {2015},
 issn = {1556-4673},
 pages = {18:1--18:25},
 articleno = {18},
 numpages = {25},
 url = {http://doi.acm.org/10.1145/2724727},
 doi = {10.1145/2724727},
 acmid = {2724727},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {image classification, kNN classification, local features, object recognition, tourism},
} 
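
As a rough illustration of the second approach in the abstract above, the sketch below classifies individual local descriptors with kNN and labels the image by consensus (majority vote) among them. Descriptor dimensionality, k, and the stand-in training data are illustrative assumptions.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Stand-in training data: each local descriptor carries the label of the
# monument depicted in the image it was extracted from.
train_descs = rng.random((1000, 128)).astype(np.float32)
train_labels = rng.integers(0, 5, size=1000)   # 5 hypothetical monuments

knn = KNeighborsClassifier(n_neighbors=3).fit(train_descs, train_labels)

def classify_image(local_descriptors):
    votes = knn.predict(local_descriptors)     # one vote per local feature
    labels, counts = np.unique(votes, return_counts=True)
    return labels[counts.argmax()], counts.max() / votes.size

label, consensus = classify_image(rng.random((200, 128)).astype(np.float32))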

Similarity caching in large-scale image retrieval

F. Falchi, C. Lucchese, S. Orlando, R. Perego, F. Rabitti

Information Processing & Management (IPM)
Volume 48, issue 5, September 2012, pp. 803-818, Elsevier Sci LTD, Oxford, UK

ISSN: 0306-4573, WOS: 000307682100001 | Scopus: 2-s2.0-84864286791 | DOI: 10.1016/j.ipm.2010.12.006

Feature-rich data, such as audio-video recordings, digital images, and results of scientific experiments, nowadays constitute the largest fraction of the massive data sets produced daily in the e-society. Content-based similarity search systems working on such data collections are rapidly growing in importance. Unfortunately, similarity search is in general very expensive and hardly scalable. In this paper we study the case of content-based image retrieval (CBIR) systems, and focus on the problem of increasing the throughput of a large-scale CBIR system that indexes a very large collection of digital images. By analyzing the query log of a real CBIR system available on the Web, we characterize the behavior of users who experience a novel search paradigm, where content-based similarity queries and text-based ones can easily be interleaved. We show that locality and self-similarity are present even in the stream of queries submitted to such a CBIR system. According to these results, we propose an effective way to exploit this locality, by means of a similarity caching system, which stores the results of recently/frequently submitted queries and associated results. Unlike traditional caching, the proposed cache can manage not only exact hits, but also approximate ones that are solved by similarity with respect to the result sets of past queries present in the cache. We extensively evaluate the proposed solution by using the real query stream recorded in the log and a collection of 100 million digital photographs. The high hit ratios and small average approximation error figures obtained demonstrate the effectiveness of the approach.
@article{FALCHI2012803,
 author = "Fabrizio Falchi and Claudio Lucchese and Salvatore Orlando and Raffaele Perego and Fausto Rabitti",
 title = "Similarity caching in large-scale image retrieval",
 journal = "Information Processing & Management",
 volume = "48",
 number = "5",
 pages = "803 - 818",
 year = "2012",
 issn = "0306-4573",
 doi = "http://dx.doi.org/10.1016/j.ipm.2010.12.006",
 url = "http://www.sciencedirect.com/science/article/pii/S030645731000107X",
}
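
As a rough illustration of the similarity-caching idea in the abstract above, the sketch below keeps recent query vectors together with their result sets and answers a new query from the cache when its distance to a cached query falls below a threshold; an exact hit is simply the zero-distance case. The FIFO eviction policy and the threshold value are illustrative assumptions, not the cache-management policies studied in the paper.

from collections import OrderedDict
import numpy as np

class SimilarityCache:
    def __init__(self, capacity=1000, max_dist=0.2):
        self.entries = OrderedDict()   # insertion order doubles as FIFO age
        self.capacity, self.max_dist = capacity, max_dist

    def lookup(self, q):
        # Exact hits have distance 0; approximate hits reuse the result set
        # of a sufficiently close past query.
        for vec, results in self.entries.values():
            if np.linalg.norm(q - vec) <= self.max_dist:
                return results
        return None

    def store(self, q, results):
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict the oldest entry
        self.entries[q.tobytes()] = (q, results)

cache = SimilarityCache()
q = np.random.default_rng(0).random(64)
if cache.lookup(q) is None:
    cache.store(q, results=["img1", "img7"])   # results from the real index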

Building a web-scale image similarity search system

M. Batko, F. Falchi, C. Lucchese, D. Novak, R. Perego, F. Rabitti, J. Sedmidubsky, P. Zezula

Multimedia Tools and Applications (MTAP)

ISSN: 1380-7501, eISSN: 1573-7721, WOS: 000275800200012 | Scopus: 2-s2.0-77950188067, DOI: 10.1007/s11042-009-0339-z

As the number of digital images is growing fast and Content-based Image Retrieval (CBIR) is gaining in popularity, CBIR systems should leap towards Web-scale datasets. In this paper, we report on our experience in building an experimental similarity search system on a test collection of more than 50 million images. The first big challenge we have been facing was obtaining a collection of images of this scale with the corresponding descriptive features. We have tackled the non-trivial process of image crawling and extraction of several MPEG-7 descriptors. The result of this effort is a test collection, the first of such scale, opened to the research community for experiments and comparisons. The second challenge was to develop indexing and searching mechanisms able to scale to the target size and to answer similarity queries in real-time. We have achieved this goal by creating sophisticated centralized and distributed structures based purely on the metric space model of data. We have joined them together which has resulted in an extremely flexible and scalable solution. In this paper, we study in detail the performance of this technology and its evolvement as the data volume grows by three orders of magnitude. The results of the experiments are very encouraging and promising for future applications.
@Article{Batko2010,
 author="Batko, Michal and Falchi, Fabrizio and Lucchese, Claudio and Novak, David and Perego, Raffaele and Rabitti, Fausto and Sedmidubsky, Jan and Zezula, Pavel",
 title="Building a web-scale image similarity search system",
 journal="Multimedia Tools and Applications",
 year="2010",
 month="May",
 day="01",
 volume="47",
 number="3",
pages="599--629",
 issn="1573-7721",
 doi="10.1007/s11042-009-0339-z",
 url="https://doi.org/10.1007/s11042-009-0339-z"
}

CoPhIR: a Test Collection for Content-Based Image Retrieval

P. Bolettieri, A. Esuli, F. Falchi, C. Lucchese, R. Perego, T. Piccioli, F. Rabitti

CoRR (Computing Research Repository), abs/0905.4627, arXiv, Cornell University Library, 2009, 15 pp.

The scalability, as well as the effectiveness, of the different Content-based Image Retrieval (CBIR) approaches proposed in literature, is today an important research issue. Given the wealth of images on the Web, CBIR systems must in fact leap towards Web-scale datasets. In this paper, we report on our experience in building a test collection of 100 million images, with the corresponding descriptive features, to be used in experimenting new scalable techniques for similarity searching, and comparing their results. In the context of the SAPIR (Search on Audio-visual content using Peer-to-peer Information Retrieval) European project, we had to experiment our distributed similarity searching technology on a realistic data set. Therefore, since no large-scale collection was available for research purposes, we had to tackle the non-trivial process of image crawling and descriptive feature extraction (we used five MPEG-7 features) using the European EGEE computer GRID. The result of this effort is CoPhIR, the first CBIR test collection of such scale. CoPhIR is now open to the research community for experiments and comparisons, and access to the collection was already granted to more than 50 research groups worldwide.
@article{DBLP:journals/corr/abs-0905-4627,
 author = {Paolo Bolettieri and
 Andrea Esuli and
 Fabrizio Falchi and
 Claudio Lucchese and
 Raffaele Perego and
 Tommaso Piccioli and
 Fausto Rabitti},
 title = {CoPhIR: a Test Collection for Content-Based Image Retrieval},
 journal = {CoRR},
 volume = {abs/0905.4627},
 year = {2009},
 url = {http://arxiv.org/abs/0905.4627},
 timestamp = {Wed, 07 Jun 2017 14:40:13 +0200},
 biburl = {http://dblp.uni-trier.de/rec/bib/journals/corr/abs-0905-4627},
 bibsource = {dblp computer science bibliography, http://dblp.org}
}

Enabling Content-Based Image Retrieval in Very Large Digital Libraries

P. Bolettieri, A. Esuli, F. Falchi, C. Lucchese, R. Perego and F. Rabitti

Proceedings of the Second Workshop on Very Large Digital Libraries, 2 October 2009, Corfu, Greece. DELOS: an Association for Digital Libraries (Pisa, Italy), 2009: pp. 43-50

ISBN: 978-888850685-2

Enabling effective and efficient Content-Based Image Retrieval (CBIR) on Very Large Digital Libraries (VLDLs) is today an important research issue. While there exist well-known approaches for information retrieval on textual content for VLDLs, the research for an effective CBIR method that is also able to scale to very large collections is still open. A practical effect of this situation is that most of the image retrieval services currently available for VLDLs are based only on textual metadata. In this paper, we report on our experience in creating a collection of 106 million images, i.e., the CoPhIR collection, the largest currently available to the scientific community for research purposes. We discuss the various issues arising from working with such a large collection and dealing with a complex retrieval model on information-rich features. We present the non-trivial process of image crawling and descriptive feature extraction, using the European EGEE computer GRID. The feature extraction phase is often ignored when discussing the scalability issue while, as we show in this work, it could be one of the toughest issues to be solved in order to make CBIR feasible on VLDLs.
@inproceedings{2009,
 author = {Paolo Bolettieri and Andrea Esuli and Fabrizio Falchi and Claudio Lucchese and Raffaele Perego and Fausto Rabitti},
 title = {Enabling Content-Based Image Retrieval in Very Large Digital Libraries},
 booktitle = {Proceeding of the Second Workshop on Very Large Digital Libraries, 2 October 2009, Corfu, Greece},
 pages = {43-50},
 year = {2009},
 publisher = {DELOS: an Association for Digital Libraries (Pisa, Italy)},
 isbn = {978-888850685-2}
}

Distance browsing in distributed multimedia databases

F. Falchi, C. Gennaro, F. Rabitti, P. Zezula

In Future Generation Computer Systems (FGCS), Volume 25, Issue 1 (January 2009), Elsevier Science Publishers B. V. (Amsterdam, The Netherlands), 2009: pp. 64-76. ISSN: 0167-739X

WOS: 000260238300008, Scopus: 2-s2.0-51249094919, DOI: 10.1016/j.future.2008.02.007

The state of the art of searching for non-text data (e.g., images) is to use extracted metadata annotations or text, which might be available as a related information. However, supporting real content-based audiovisual search, based on similarity search on features, is significantly more expensive than searching for text. Moreover, such search exhibits linear scalability with respect to the dataset size, so parallel query execution is needed. In this paper, we present a Distributed Incremental Nearest Neighbor algorithm (DINN) for finding closest objects in an incremental fashion over data distributed among computer nodes, each able to perform its local Incremental Nearest Neighbor (local-INN) algorithm. We prove that our algorithm is optimum with respect to both the number of involved nodes and the number of local-INN invocations. An implementation of our DINN algorithm, on a real P2P system called MCAN, was used for conducting an extensive experimental evaluation on a real-life dataset. The proposed algorithm is being used in two running projects: SAPIR and NeP4B.
@article{FALCHI200964,
 title = "Distance browsing in distributed multimedia databases",
 journal = "Future Generation Computer Systems",
 volume = "25",
 number = "1",
 pages = "64 - 76",
 year = "2009",
 note = "",
 issn = "0167-739X",
 doi = "http://dx.doi.org/10.1016/j.future.2008.02.007",
 url = "http://www.sciencedirect.com/science/article/pii/S0167739X08000186",
 author = "Fabrizio Falchi and Claudio Gennaro and Fausto Rabitti and Pavel Zezula",
}
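
As a rough illustration of the distributed incremental NN idea in the abstract above, the sketch below merges per-node incremental streams of (distance, object) pairs with a heap, yielding results in globally increasing distance from the query. The optimal scheduling of node invocations proved in the paper is not modeled here.

import heapq

def distributed_inn(local_inn_iterators):
    # Each local iterator yields (distance, object) in increasing distance.
    heap = []
    for i, it in enumerate(local_inn_iterators):
        first = next(it, None)
        if first is not None:
            heap.append((first[0], i, first[1], it))
    heapq.heapify(heap)
    while heap:
        dist, i, obj, it = heapq.heappop(heap)
        yield dist, obj                        # globally next-closest object
        nxt = next(it, None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], i, nxt[1], it))

streams = [iter([(0.1, "a"), (0.7, "c")]), iter([(0.4, "b")])]
print(list(distributed_inn(streams)))   # [(0.1, 'a'), (0.4, 'b'), (0.7, 'c')]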

Scalability comparison of Peer-to-Peer similarity search structures

M. Batko, D. Novak, F. Falchi, P. Zezula

In Future Generation Computer Systems (FGCS) Volume 24, Issue 8 (October 2008), Elsevier Science Publishers B. V. (Amsterdam, The Netherlands), 2008: pp. 834-848. ISSN: 0167-739X

WOS: 000258426100008, Scopus: 2-s2.0-46849117520, DOI: 10.1016/j.future.2007.07.012

Due to the increasing complexity of current digital data, similarity search has become a fundamental computational task in many applications. Unfortunately, its costs are still high and grow linearly on single server structures, which prevents them from efficient application on large data volumes. In this paper, we shortly describe four recent scalable distributed techniques for similarity search and study their performance in executing queries on three different datasets. Though all the methods employ parallelism to speed up query execution, different advantages for different objectives have been identified by experiments. The reported results would be helpful for choosing the best implementations for specific applications. They can also be used for designing new and better indexing structures in the future.
@article{BATKO2008834,
 title = "Scalability comparison of Peer-to-Peer similarity search structures",
 journal = "Future Generation Computer Systems",
 volume = "24",
 number = "8",
 pages = "834 - 848",
 year = "2008",
 note = "",
 issn = "0167-739X",
 doi = "http://dx.doi.org/10.1016/j.future.2007.07.012",
 url = "http://www.sciencedirect.com/science/article/pii/S0167739X0700132X",
 author = "Michal Batko and David Novak and Fabrizio Falchi and Pavel Zezula",
}

Nearest neighbor search in metric spaces through Content-Addressable Networks

F. Falchi, C. Gennaro, P. Zezula

Information Processing and Management (IPM). Volume 44, Issue 1 (2008).
Elsevier Sci LTD, Oxford, UK: pp. 411-429. ISSN: 0306-4573

WOS: 000251441600031, Scopus: 2-s2.0-35549001762, DOI: 10.1016/j.ipm.2007.03.002

Most of the peer-to-peer search techniques proposed in the recent years have focused on the single-key retrieval. However, similarity search in metric spaces represents an important paradigm for content-based retrieval in many applications. In this paper we introduce an extension of the well-known Content-Addressable Network paradigm to support storage and retrieval of more generic metric space objects. In particular we address the problem of executing the nearest neighbors queries, and propose three different algorithms of query propagation. An extensive experimental study on real-life data sets explores the performance characteristics of the proposed algorithms by showing their advantages and disadvantages.
@article{FALCHI2008411,
 title = "Nearest neighbor search in metric spaces through Content-Addressable Networks",
 journal = "Information Processing & Management",
 volume = "44",
 number = "1",
 pages = "411 - 429",
 year = "2008",
 note = "Evaluation of Interactive Information Retrieval Systems",
 issn = "0306-4573",
 doi = "http://dx.doi.org/10.1016/j.ipm.2007.03.002",
 url = "http://www.sciencedirect.com/science/article/pii/S0306457307000763",
 author = "Fabrizio Falchi and Claudio Gennaro and Pavel Zezula"
}

A content-addressable network for similarity search in metric spaces

F. Falchi, C. Gennaro, P. Zezula

In Databases, Information Systems, and Peer-to-Peer Computing, International Workshops, DBISP2P 2005/2006, Trondheim, Norway, August 28-29, 2005, Seoul, Korea, September 11, 2006, Revised Selected Papers. Lecture Notes in Computer Science, vol. 4125. Springer-Verlag Berlin Heidelberg (Germany), 2007: pp. 98-110. ISBN: 978-3-540-71660-0, ISSN: 0302-9743, WOS: 000246228700009, Scopus: 2-s2.0-38149086079, DOI: 10.1007/978-3-540-71661-7_9

In this paper we present a scalable and distributed access structure for similarity search in metric spaces. The approach is based on the Content-addressable Network (CAN) paradigm, which provides a Distributed Hash Table (DHT) abstraction over a Cartesian space. We have extended the CAN structure to support storage and retrieval of generic metric space objects. We use pivots for projecting objects of the metric space in an N-dimensional vector space, and exploit the CAN organization for distributing the objects among the computing nodes of the structure. We obtain a Peer-to-Peer network, called the MCAN, which is able to search metric space objects by means of the similarity range queries. Experiments conducted on our prototype system confirm full scalability of the approach.
@inproceedings{Falchi:2005:CNS:1783738.1783751,
 author = {Falchi, Fabrizio and Gennaro, Claudio and Zezula, Pavel},
 title = {A Content-addressable Network for Similarity Search in Metric Spaces},
 booktitle = {Proceedings of the 2005/2006 International Conference on Databases, Information Systems, and Peer-to-peer Computing},
 series = {DBISP2P'05/06},
 year = {2007},
 isbn = {978-3-540-71660-0},
 location = {Trondheim, Norway},
 pages = {98--110},
 numpages = {13},
 url = {http://dl.acm.org/citation.cfm?id=1783738.1783751},
 acmid = {1783751},
 publisher = {Springer-Verlag},
 address = {Berlin, Heidelberg},
}
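
As a rough illustration of the pivot projection described in the abstract above, the sketch below maps metric objects to vectors of their distances from the pivots; under the L-infinity distance the projection is contractive by the triangle inequality, which is what lets the CAN overlay route range queries without false dismissals. Euclidean data and four pivots are illustrative assumptions.

import numpy as np

def project(obj, pivots, d):
    # Map a metric object to the vector of its distances from the pivots.
    return np.array([d(obj, p) for p in pivots])

d = lambda a, b: float(np.linalg.norm(a - b))   # the original metric
rng = np.random.default_rng(0)
pivots = rng.random((4, 16))                    # 4 illustrative pivots

q, o = rng.random(16), rng.random(16)
pq, po = project(q, pivots, d), project(o, pivots, d)
# L-infinity distance in the projected (CAN) space never exceeds d(q, o).
assert np.max(np.abs(pq - po)) <= d(q, o) + 1e-9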

Editorial

Special Section on "Similarity Search and Applications: Selected papers from SISAP 2015"

Edited by G. Amato, R. Connor, F. Falchi and C. Gennaro

Information Systems, Volume 64, March 2017

ISSN: 0306-4379

Similarity Search and Applications

8th International Conference, SISAP 2015, Glasgow, UK, October 12-14, 2015, Proceedings

G. Amato, R. Connor, F. Falchi, C. Gennaro

Lecture Notes in Computer Science, vol. 9371

ISSN: 0302-9743, Print ISBN: 978-3-319-25086-1, Online ISBN: 978-3-319-25087-8
Scopus: 2-s2.0-84951868065, DOI: 10.1007/978-3-319-25087-8

@proceedings{DBLP:conf/sisap/2015,
	 editor = {Giuseppe Amato and
				 Richard C. H. Connor and
				 Fabrizio Falchi and
				 Claudio Gennaro},
	 title = {Similarity Search and Applications - 8th International Conference,
				 {SISAP} 2015, Glasgow, UK, October 12-14, 2015, Proceedings},
	 series = {Lecture Notes in Computer Science},
	 volume = {9371},
	 publisher = {Springer},
	 year = {2015},
	 url = {https://doi.org/10.1007/978-3-319-25087-8},
	 doi = {10.1007/978-3-319-25087-8},
	 isbn = {978-3-319-25086-1}
	}

Special track on Engineering Large-Scale Distributed Systems: editorial message

F. Falchi, C. Lucchese

Proceedings of the 2008 ACM Symposium on Applied Computing (SAC), Fortaleza, Ceara, Brazil, March 16-20, 2008, vol. I, ACM, 2008: editorial message, pp. 453-454.

ISBN: 978-1-59593-753-7, Scopus: 2-s2.0-56749175979, DOI: 10.1145/1363686.1363799

@inproceedings{Falchi:2008:STE:1363686.1363799,
 author = {Falchi, Fabrizio and Lucchese, Claudio},
 title = {Special Track on Engineering Large-Scale Distributed Systems: Editorial Message},
 booktitle = {Proceedings of the 2008 ACM Symposium on Applied Computing},
 series = {SAC '08},
 year = {2008},
 isbn = {978-1-59593-753-7},
 location = {Fortaleza, Ceara, Brazil},
 pages = {453--454},
 numpages = {2},
 url = {http://doi.acm.org/10.1145/1363686.1363799},
 doi = {10.1145/1363686.1363799},
 acmid = {1363799},
 publisher = {ACM},
 address = {New York, NY, USA},
}



Other

 Re-implementing and Extending Relation Network for R-CBIR

N. Messina, G. Amato, F. Falchi
In IRCDL 2020: Digital Libraries: The Era of Big Data and Data Science.

Relational reasoning is an emerging theme in Machine Learning in general and in Computer Vision in particular. DeepMind has recently proposed a module called Relation Network (RN) that has shown impressive results on visual question answering tasks. Unfortunately, the implementation of the proposed approach was not public. To reproduce their experiments and extend their approach in the context of Information Retrieval, we had to re-implement everything, testing many parameters and conducting many experiments. Our implementation is now public on GitHub and it is already used by a large community of researchers. Furthermore, we recently presented a variant of the Relation Network module that we called Aggregated Visual Features RN (AVF-RN). This network can produce and aggregate, at inference time, compact visual relationship-aware features for the Relational-CBIR (R-CBIR) task. R-CBIR consists of retrieving images with given relationships among objects. In this paper, we discuss the details of our Relation Network implementation and report more experimental results than the original paper. Relational reasoning is a very promising topic for better understanding and retrieving inter-object relationships, especially in digital libraries.
@InProceedings{10.1007/978-3-030-39905-4_9,
author="Messina, Nicola
and Amato, Giuseppe
and Falchi, Fabrizio",
editor="Ceci, Michelangelo
and Ferilli, Stefano
and Poggi, Antonella",
title="Re-implementing and Extending Relation Network for R-CBIR",
booktitle="Digital Libraries: The Era of Big Data and Data Science",
year="2020",
publisher="Springer International Publishing",
address="Cham",
pages="82--92",
isbn="978-3-030-39905-4"
}
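
As a rough illustration of the Relation Network module discussed in the abstract above, the sketch below applies a small MLP g to every ordered pair of object features, sums the pair outputs, and maps the aggregate through a second MLP f. Layer sizes are illustrative assumptions, and the question conditioning used in VQA variants is omitted for brevity.

import torch
import torch.nn as nn

class RelationNetwork(nn.Module):
    def __init__(self, obj_dim=64, hidden=256, out_dim=128):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * obj_dim, hidden), nn.ReLU(),
                               nn.Linear(hidden, hidden), nn.ReLU())
        self.f = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                               nn.Linear(hidden, out_dim))

    def forward(self, objects):                  # objects: (batch, n, obj_dim)
        b, n, d = objects.shape
        oi = objects.unsqueeze(2).expand(b, n, n, d)
        oj = objects.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([oi, oj], dim=-1)      # all ordered pairs (i, j)
        relations = self.g(pairs).sum(dim=(1, 2))  # aggregate pair scores
        return self.f(relations)

out = RelationNetwork()(torch.rand(2, 8, 64))    # -> shape (2, 128)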

AIMH Research Activities 2023

Aloia N., Amato G., Bartalesi V., Bianchi L., Bolettieri P., Bosio C., Carraglia M., Carrara F., Casarosa V., Ciampi L., Coccomini D. A., Concordia C., Corbara S., De Martino C., Di Benedetto M., Esuli A., Falchi F., Fazzari E., Gennaro C., Lagani G., Lenzi E., Meghini C., Messina N., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Puccetti G., Rabitti F., Savino P., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C., Versienti L.
CNR ISTI Annual Report, No. 490372, December 2023.
DOI: 10.32079/isti-ar-2023/001.

@techreport{oai:it.cnr:prodotti:490372,
	title = {AIMH Research Activities 2023},
	number = {490372},
	author = {Aloia N. and Amato G. and Bartalesi V. and Bianchi L. and Bolettieri P. and Bosio C. and Carraglia M. and Carrara F. and Casarosa V. and Ciampi L. and Coccomini D. A. and Concordia C. and Corbara S. and De Martino C. and Di Benedetto M. and Esuli A. and Falchi F. and Fazzari E. and Gennaro C. and Lagani G. and Lenzi E. and Meghini C. and Messina N. and Molinari A. and Moreo A. and Nardi A. and Pedrotti A. and Pratelli N. and Puccetti G. and Rabitti F. and Savino P. and Sebastiani F. and Sperduti G. and Thanos C. and Trupiano L. and Vadicamo L. and Vairo C. and Versienti L.},
	doi = {10.32079/isti-ar-2023/001},
	institution = {ISTI Annual Reports, 2023},
	year = {2023}
}

AIMH Research Activities 2022

Aloia N., Amato G., Bartalesi V., Benedetti F., Bolettieri P., Cafarelli D., Carrara F., Casarosa V., Ciampi L., Coccomini D., Concordia C., Corbara S., Di Benedetto M., Esuli A., Falchi F., Gennaro C., Lagani G., Meghini C., Messina N., Metilli D., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C.
CNR ISTI Annual Report, No. 475648, December 2022.
DOI: 10.32079/isti-ar-2022/002.

@techreport{oai:it.cnr:prodotti:475648,
	title = {AIMH research activities 2022},
	number = {475648},
	author = {Aloia N. and Amato G. and Bartalesi V. and Benedetti F. and Paolo Bolettieri and Cafarelli D. and Carrara F. and Casarosa V. and Ciampi L. and Coccomini D. A. and Concordia C. and Corbara S. and Di Benedetto M. and Esuli A. and Falchi F. and Gennaro C. and Lagani G. and Lenzi E. and Meghini C. and Messina N. and Metilli D. and Molinari A. and Moreo A. and Nardi A. and Pedrotti A. and Pratelli N. and Rabitti F. and Savino P. and Sebastiani F. and Sperduti G. and Thanos C. and Trupiano L. and Vadicamo L. and Vairo C.},
	doi = {10.32079/isti-ar-2022/002},
	institution = {ISTI Annual reports, 2022},
	year = {2022}
}

AIMH Research Activities 2021

Aloia N., Amato G., Bartalesi V., Benedetti F., Bolettieri P., Cafarelli D., Carrara F., Casarosa V., Ciampi L., Coccomini D., Concordia C., Corbara S., Di Benedetto M., Esuli A., Falchi F., Gennaro C., Lagani G., Massoli F. V., Meghini C., Messina N., Metilli D., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Sperduti G., Thanos C., Trupiano L., Vadicamo L., Vairo C.
CNR ISTI Annual Report, No. 461931, December 2021.
DOI: 10.32079/isti-ar-2021/003.

@techreport{oai:it.cnr:prodotti:461931,
 title = {AIMH Research Activities 2021},
 author = {Aloia, Nicola and Amato, Giuseppe and Bartalesi Lenzi, Valentina and Benedetti, Filippo and Bolettieri, Paolo and Cafarelli, Donato and Carrara, Fabio and Casarosa, Vittore and Ciampi, Luca and Coccomini, Davide Alessandro and Concordia, Cesare and Corbara, Silvia and Di Benedetto, Marco and Esuli, Andrea and Falchi, Fabrizio and Gennaro, Claudio and Lagani, Gabriele and Massoli, Fabio Valerio and Meghini, Carlo and Messina, Nicola and Metilli, Daniele and Molinari, Alessio and Moreo Fernandez, Alejandro David and Nardi, Alessandro and Pedrotti, Andrea and Pratelli, Nicolò and Rabitti, Fausto and Savino, Pasquale and Sebastiani, Fabrizio and Sperduti, Gianluca and Thanos, Costantino and Trupiano, Luca and Vadicamo, Lucia and Vairo, Claudio Francesco},
 number = {461931},
 group = {AIMH},
 year = {2021},
 institution = {Consiglio Nazionale delle Ricerche},
 doi = {10.32079/isti-ar-2021/003}
}

AIMH Research Activities 2020

Aloia N., Amato G., Bartalesi V., Benedetti F., Bolettieri P., Carrara F., Casarosa V., Ciampi L., Concordia C., Corbara S., Esuli A., Falchi F., Gennaro C., Lagani G., Massoli F. V., Meghini C., Messina N., Metilli D., Molinari A., Moreo A., Nardi A., Pedrotti A., Pratelli N., Rabitti F., Savino P., Sebastiani F., Thanos C., Trupiano L., Vadicamo L., Vairo C.
CNR ISTI Annual Report, No. 440133, December 2020.
DOI: 10.32079/isti-ar-2020/001.

@techreport{oai:it.cnr:prodotti:440133,
 title = {AIMH Research Activities 2020},
 author = {Aloia, Nicola and Amato Giuseppe and Bartalesi Valentina and Benedetti Filippo and Bolettieri Paolo and Carrara Fabio and Casarosa Vittore and Ciampi Luca and Concordia Cesare and Corbara Silvia and Esuli Andrea and Falchi Fabrizio and Gennaro Claudio and Lagani Gabriele and Massoli Fabio Valerio and Meghini Carlo and Messina Nicola and Metilli Daniele and Molinari Alessio and Moreo Alejandro and Nardi Alessandro and Pedrotti Andrea and Pratelli Nicolò and Rabitti Fausto and Savino Pasquale and Sebastiani Fabrizio and Thanos Costantino and Trupiano Luca and Vadicamo Lucia and Vairo Claudio},
 number = {440133},
 group = {AIMH},
 year = {2020},
 institution = {Consiglio Nazionale delle Ricerche},
 doi = {10.32079/isti-ar-2020/001}
}

AIMIR Research Activities 2019

G. Amato, P. Bolettieri, F. Carrara, L. Ciampi, M. Di Benedetto, F. Debole, F. Falchi, C. Gennaro, G. Lagani, F.V. Massoli, N. Messina, F. Rabitti, P. Savino, L. Vadicamo, C. Vairo
CNR ISTI Annual Report, No. 413891, December 2019.

 Detecting Adversarial Inputs by Looking in the black box

F. Carrara, F. Falchi, G. Amato, R. Becarelli, R. Caldelli
ERCIM News 116 - Special theme: Transparency in Algorithmic Decision Making.

The astonishing and cryptic effectiveness of Deep Neural Networks comes with the critical vulnerability to adversarial inputs — samples maliciously crafted to confuse and hinder machine learning models. Insights into the internal representations learned by deep models can help to explain their decisions and estimate their confidence, which can enable us to trace, characterise, and filter out adversarial attacks.
@Article{ERCIMNews2019Carrara,
author="Carrara, Fabio and Falchi, Fabrizio and Amato, Giuseppe, and Becarelli, Rudy and Caldelli, Roberto",
title="Detecting Adversarial Inputs by Looking in the black box",
journal="ERCIM News",
year="2019",
month="January",
issn="0926-4981",
url="https://ercim-news.ercim.eu/images/stories/EN116/EN116-web.pdf"
}
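
As a rough illustration of the idea in the article above, the sketch below scores an input by the agreement between the classifier's prediction and the labels of its nearest neighbors in activation space over known-clean data; low agreement flags a possible adversarial input. The kNN scoring rule and the threshold are illustrative assumptions, not the detectors analysed in the article.

import numpy as np

def knn_agreement(activation, clean_acts, clean_labels, predicted, k=10):
    dists = np.linalg.norm(clean_acts - activation, axis=1)
    neighbors = clean_labels[np.argsort(dists)[:k]]
    return float(np.mean(neighbors == predicted))  # agreement in [0, 1]

rng = np.random.default_rng(0)
clean_acts = rng.random((500, 256))       # hidden activations of clean data
clean_labels = rng.integers(0, 10, size=500)
score = knn_agreement(rng.random(256), clean_acts, clean_labels, predicted=3)
suspicious = score < 0.5                  # low agreement -> possible attack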

A Comparison of Face Verification with Facial Landmarks and Deep Features

G. Amato, F. Falchi, C. Gennaro, C. Vairo
MMEDIA 2018 - The Tenth International Conference on Advances in Multimedia.

Face verification is a key task in many application fields, such as security and surveillance. Several approaches and methodologies are currently used to try to determine if two faces belong to the same person. Among these, facial landmarks are very important in forensics, since the distance between some characteristic points of a face can be used as an objective measure in court during trials. However, the accuracy of the approaches based on facial landmarks in verifying whether a face belongs to a given person or not is often not quite good. Recently, deep learning approaches have been proposed to address the face verification problem, with very good results. In this paper, we compare the accuracy of facial landmarks and deep learning approaches in performing the face verification task. Our experiments, conducted on a real case scenario, show that the deep learning approach greatly outperforms in accuracy the facial landmarks approach.
@InProceedings{2018-MMEDIA-Vairo,
author="Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio and Vairo, Claudio",
title="A Comparison of Face Verification with Facial Landmarks and Deep Features",
booktitle="MMEDIA 2018, The Tenth International Conference on Advances in Multimedia",
year="2018",
publisher="IARIA",
pages="1--6",
abstract="Face verification is a key task in many application fields, such as security and surveillance. Several approaches and methodologies are currently used to try to determine if two faces belong to the same person. Among these, facial landmarks are very important in forensics, since the distance between some characteristic points of a face can be used as an objective measure in court during trials. However, the accuracy of the approaches based on facial landmarks in verifying whether a face belongs to a given person or not is often not quite good. Recently, deep learning approaches have been proposed to address the face verification problem, with very good results. In this paper, we compare the accuracy of facial landmarks and deep learning approaches in performing the face verification task. Our experiments, conducted on a real case scenario, show that the deep learning approach greatly outperforms in accuracy the facial landmarks approach.",
isbn="978-1-61208-627-9",
issn="2308-4448",
}
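
As a rough illustration of the two strategies compared in the abstract above, the sketch below reduces both to a distance-plus-threshold decision: averaged landmark displacement for the landmark approach, cosine similarity of deep descriptors for the deep approach. Landmark alignment, feature sizes, and thresholds are illustrative assumptions.

import numpy as np

def verify_landmarks(lm_a, lm_b, thr=0.1):
    # lm_a, lm_b: (n_points, 2) aligned facial-landmark coordinates
    return float(np.mean(np.linalg.norm(lm_a - lm_b, axis=1))) < thr

def verify_deep(feat_a, feat_b, thr=0.6):
    # feat_a, feat_b: deep face descriptors; accept if cosine similarity high
    cos = feat_a @ feat_b / (np.linalg.norm(feat_a) * np.linalg.norm(feat_b))
    return cos > thr

rng = np.random.default_rng(0)
same_landmarks = verify_landmarks(rng.random((68, 2)), rng.random((68, 2)))
same_deep = verify_deep(rng.random(2048), rng.random(2048))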

Deep Learning Techniques for Visual Food Recognition on a Mobile App

M. De Bonis, G. Amato, F. Falchi, C. Gennaro, P. Manghi
MISSI 2018, 11th International Conference on Multimedia & Network Information Systems. Scopus: 2-s2.0-85053849854, DOI: 10.1007/978-3-319-98678-4_31. Nominated for the Best paper award

The paper provides an efficient solution to implement a mobile application for food recognition using Convolutional Neural Networks (CNNs). Different CNN architectures have been trained and tested on two datasets available in the literature, and the best one in terms of accuracy has been chosen. Since our CNN runs on a mobile phone, efficiency measurements have also been taken into account, both in terms of memory and computational requirements. The mobile application has been implemented relying on RenderScript and the weights of every layer have been serialized in different files stored in the mobile phone memory. Extensive experiments have been carried out to choose the optimal configuration and tuning parameters.
@InProceedings{10.1007/978-3-319-98678-4_31,
author="De Bonis, Michele
and Amato, Giuseppe
and Falchi, Fabrizio
and Gennaro, Claudio
and Manghi, Paolo",
editor="Choro{\'{s}}, Kazimierz
and Kopel, Marek
and Kukla, El{\.{z}}bieta
and Siemi{\'{n}}ski, Andrzej",
title="Deep Learning Techniques for Visual Food Recognition on a Mobile App",
booktitle="Multimedia and Network Information Systems",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="303--312",
abstract="The paper provides an efficient solution to implement a mobile application for food recognition using Convolutional Neural Networks (CNNs). Different CNNs architectures have been trained and tested on two datasets available in literature and the best one in terms of accuracy has been chosen. Since our CNN runs on a mobile phone, efficiency measurements have also taken into account both in terms of memory and computational requirements. The mobile application has been implemented relying on RenderScript and the weights of every layer have been serialized in different files stored in the mobile phone memory. Extensive experiments have been carried out to choose the optimal configuration and tuning parameters.",
isbn="978-3-319-98678-4"
}

Re-ranking Permutation-Based Candidate Sets with the n-Simplex Projection

Amato G., Chávez E., Connor R., Falchi F., Gennaro C., Vadicamo L.
Similarity Search and Applications, 11th International Conference, SISAP 2018, Lima, Peru, October 7–9, 2018, Proceedings, Scopus: 2-s2.0-85055133979, DOI: 10.1007/978-3-030-02224-2_1
Best paper award

In the realm of metric search, the permutation-based approaches have shown very good performance in indexing and supporting approximate search on large databases. These methods embed the metric objects into a permutation space where candidate results to a given query can be efficiently identified. Typically, to achieve high effectiveness, the permutation-based result set is refined by directly comparing each candidate object to the query one. Therefore, one drawback of these approaches is that the original dataset needs to be stored and then accessed during the refining step. We propose a refining approach based on a metric embedding, called n-Simplex projection, that can be used on metric spaces meeting the n-point property. The n-Simplex projection provides upper- and lower-bounds of the actual distance, derived using the distances between the data objects and a finite set of pivots. We propose to reuse the distances computed for building the data permutations to derive these bounds and we show how to use them to improve the permutation-based results. Our approach is particularly advantageous for all the cases in which the traditional refining step is too costly, e.g. very large dataset or very expensive metric function.
@InProceedings{10.1007/978-3-030-02224-2_1,
author="Amato, Giuseppe
and Ch{\'a}vez, Edgar
and Connor, Richard
and Falchi, Fabrizio
and Gennaro, Claudio
and Vadicamo, Lucia",
editor="Marchand-Maillet, St{\'e}phane
and Silva, Yasin N.
and Ch{\'a}vez, Edgar",
title="Re-ranking Permutation-Based Candidate Sets with the n-Simplex Projection",
booktitle="Similarity Search and Applications",
year="2018",
publisher="Springer International Publishing",
address="Cham",
pages="3--17",
abstract="In the realm of metric search, the permutation-based approaches have shown very good performance in indexing and supporting approximate search on large databases. These methods embed the metric objects into a permutation space where candidate results to a given query can be efficiently identified. Typically, to achieve high effectiveness, the permutation-based result set is refined by directly comparing each candidate object to the query one. Therefore, one drawback of these approaches is that the original dataset needs to be stored and then accessed during the refining step. We propose a refining approach based on a metric embedding, called n-Simplex projection, that can be used on metric spaces meeting the n-point property. The n-Simplex projection provides upper- and lower-bounds of the actual distance, derived using the distances between the data objects and a finite set of pivots. We propose to reuse the distances computed for building the data permutations to derive these bounds and we show how to use them to improve the permutation-based results. Our approach is particularly advantageous for all the cases in which the traditional refining step is too costly, e.g. very large dataset or very expensive metric function.",
isbn="978-3-030-02224-2"
}
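
As a rough illustration of bound-based re-ranking in the spirit of the abstract above, the sketch below uses the classic one-pivot triangle-inequality bounds |d(q,p) - d(o,p)| <= d(q,o) <= d(q,p) + d(o,p) as a simpler stand-in for the tighter n-Simplex bounds, whose construction requires the n-point property. As in the paper, the pivot distances are exactly the values already computed to build the permutations.

import numpy as np

def distance_bounds(q_to_pivots, o_to_pivots):
    lower = np.max(np.abs(q_to_pivots - o_to_pivots))   # best lower bound
    upper = np.min(q_to_pivots + o_to_pivots)           # best upper bound
    return lower, upper

def rerank(q_to_pivots, candidates):
    # candidates: {object id: vector of its pivot distances}
    bounds = {i: distance_bounds(q_to_pivots, o) for i, o in candidates.items()}
    best_upper = min(u for _, u in bounds.values())
    # For a 1-NN query, anything whose lower bound exceeds the best upper
    # bound can be discarded without touching the original dataset.
    keep = [i for i, (l, _) in bounds.items() if l <= best_upper]
    return sorted(keep, key=lambda i: bounds[i][0])

rng = np.random.default_rng(0)
order = rerank(rng.random(8), {i: rng.random(8) for i in range(100)})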

Facial-based Intrusion Detection System with Deep Learning in Embedded Devices

G. Amato, F. Carrara, F. Falchi, C. Gennaro, C. Vairo
In SSIP 2018 Proceedings of the 2018 International Conference on Sensors, Signal and Image Processing. Prague, Czech Republic — October 12 - 14, 2018. ACM New York, NY, USA 2018. pp. 64-68. Scopus: 2-s2.0-85061105273, DOI: 10.1145/3290589.3290598

With the advent of deep learning based methods, facial recognition algorithms have become more effective and efficient. However, these algorithms usually have the disadvantage of requiring the use of dedicated hardware devices, such as graphical processing units (GPUs), which pose restrictions on their usage on embedded devices with limited computational power. In this paper, we present an approach that allows building an intrusion detection system, based on face recognition, running on embedded devices. It relies on deep learning techniques and does not exploit the GPUs. Face recognition is performed using a k-nn classifier on features extracted from a 50-layer Residual Network (ResNet-50) trained on the VGGFace2 dataset. In our experiment, we determined the optimal confidence threshold that allows distinguishing legitimate users from intruders. In order to validate the proposed system, we created a ground truth composed of 15,393 images of faces and 44 identities, captured by two smart cameras placed in two different offices, in a test period of six months. We show that the obtained results are good both from the efficiency and the effectiveness points of view.
@inproceedings{Amato:2018:FID:3290589.3290598,
 author = {Amato, Giuseppe and Carrara, Fabio and Falchi, Fabrizio and Gennaro, Claudio and Vairo, Claudio},
 title = {Facial-based Intrusion Detection System with Deep Learning in Embedded Devices},
 booktitle = {Proceedings of the 2018 International Conference on Sensors, Signal and Image Processing},
 series = {SSIP 2018},
 year = {2018},
 isbn = {978-1-4503-6620-5},
 location = {Prague, Czech Republic},
 pages = {64--68},
 numpages = {5},
 url = {http://doi.acm.org/10.1145/3290589.3290598},
 doi = {10.1145/3290589.3290598},
 acmid = {3290598},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Convolutional Neural Network, Deep learning, Embedded devices, Facial recognition, Intrusion detection},
}
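
As a rough illustration of the recognition step in the abstract above, the sketch below matches a probe feature (as would be extracted by the ResNet-50, not shown) against the enrolled gallery with a kNN rule and rejects the identification as an intrusion when the confidence falls below a threshold. Gallery contents, k, and the threshold value are illustrative assumptions.

import numpy as np

def identify(probe, gallery_feats, gallery_ids, k=5, thr=0.55):
    sims = gallery_feats @ probe / (
        np.linalg.norm(gallery_feats, axis=1) * np.linalg.norm(probe))
    top = np.argsort(-sims)[:k]                # k most similar gallery faces
    ids, counts = np.unique(gallery_ids[top], return_counts=True)
    best = ids[counts.argmax()]                # majority identity among top-k
    confidence = sims[top][gallery_ids[top] == best].mean()
    return best if confidence >= thr else "intruder"

rng = np.random.default_rng(0)
gallery = rng.random((300, 2048))              # enrolled legitimate users
who = identify(rng.random(2048), gallery, rng.integers(0, 44, size=300))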
 

Counting vehicles with cameras

L. Ciampi, G. Amato, F. Falchi, C. Gennaro, F. Rabitti
SEBD 2018 - Proceedings of the 26th Italian Symposium on Advanced Database Systems, Castellaneta Marina (Taranto), Italy, June 24-27, 2018. ISSN: 1613-0073. Scopus: 2-s2.0-85051861530

This paper aims to develop a method that can accurately count vehicles from images of parking areas captured by smart cameras. To this end, we have proposed a deep learning-based approach for car detection that permits the input images to be of arbitrary perspectives, illumination, and occlusions. No other information about the scenes is needed, such as the position of the parking lots or the perspective maps. This solution is tested using Counting CNRPark-EXT, a new dataset created for this specific task, which is a further contribution of our research. Our experiments show that our solution outperforms the state-of-the-art approaches.
@inproceedings{ciampi2018counting,
 title={Counting Vehicles with Cameras.},
 author={Ciampi, Luca and Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto},
 booktitle={SEBD},
 year={2018}
}

Towards Multimodal Surveillance For Smart Building Security

G. Amato, P. Barsocchi, F. Falchi, E. Ferro, C. Gennaro, G.R. Leone, D. Moroni, O. Salvetti, C. Vairo
In Proceedings of the International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM) at EUSIPCO 2017, Kos Island, Greece, 2 September 2017.
DOI: 10.3390/proceedings2020095

The main goal of a surveillance system is to collect information in a sensing environment and notify unexpected behavior. Information provided by single sensor and surveillance technology may not be sufficient to understand the whole context of the monitored environment. On the other hand, by combining information coming from different sources, the overall performance of a surveillance system can be improved. In this paper, we present the Smart Building Suite, in which independent and different technologies are developed in order to realize a multimodal surveillance system.
@Article{proceedings2020095,
AUTHOR = {Amato, Giuseppe and Barsocchi, Paolo and Falchi, Fabrizio and Ferro, Erina and Gennaro, Claudio and Leone, Giuseppe Riccardo and Moroni, Davide and Salvetti, Ovidio and Vairo, Claudio},
TITLE = {Towards Multimodal Surveillance for Smart Building Security},
JOURNAL = {Proceedings},
VOLUME = {2},
YEAR = {2018},
NUMBER = {2},
ARTICLE NUMBER = {95},
URL = {http://www.mdpi.com/2504-3900/2/2/95},
ISSN = {2504-3900},
DOI = {10.3390/proceedings2020095}
}
 

How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science

G. Amato, L. Candela, D. Castelli, A. Esuli, F. Falchi, C. Gennaro, F. Giannotti, A. Monreale, M. Nanni, P. Pagano, L. Pappalardo, D. Pedreschi, F. Pratesi, F. Rabitti, S. Rinzivillo, G. Rossetti, S. Ruggieri, F. Sebastiani, M. Tesconi
A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Springer International Publishing, pages 287-306
DOI: 10.1007/978-3-319-61893-7_17

During the last 35 years, data management principles such as physical and logical independence, declarative querying and cost-based optimization have led to profound pervasiveness of relational databases in any kind of organization. More importantly, these technical advances have enabled the first round of business intelligence applications and laid the foundation for managing and analyzing Big Data today.
@Inbook{Amato2018,
 author="Amato, G. and Candela, L. and Castelli, D. and Esuli, A. and Falchi, F. and Gennaro, C. and Giannotti, F. and Monreale, A. and Nanni, M. and Pagano, P. and Pappalardo, L. and Pedreschi, D. and Pratesi, F. and Rabitti, F. and Rinzivillo, S. and Rossetti, G. and Ruggieri, S. and Sebastiani, F. and Tesconi, M.", editor="Flesca, Sergio and Greco, Sergio and Masciari, Elio and Sacc{\`a}, Domenico",
 title="How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science",
 bookTitle="A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years",
 year="2018",
 publisher="Springer International Publishing",
 address="Cham",
 pages="287--306",
 isbn="978-3-319-61893-7",
 doi="10.1007/978-3-319-61893-7_17",
 url="https://doi.org/10.1007/978-3-319-61893-7_17"
}

Searching and annotating 100M Images with YFCC100M-HNfc6 and MI-File

Amato G., Falchi F., Gennaro C., Rabitti F.
CBMI '17 Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, Article No. 26 Florence, Italy — June 19 - 21, 2017. ACM New York, NY, USA ©2017. ISBN: 978-1-4503-5333-5, WOS: 000426964400026, Scopus: 2-s2.0-85030765625, DOI: 10.1145/3095713.3095740

We present an image search engine that allows searching by similarity among the about 100M images included in the YFCC100M dataset, and annotating query images. Image similarity search is performed using YFCC100M-HNfc6, the set of deep features we extracted from the YFCC100M dataset, which was indexed using the MI-File index for efficient similarity searching. A metadata cleaning algorithm that uses visual and textual analysis was used to select from the YFCC100M dataset a relevant subset of images and associated annotations, to create a training set to perform automatic textual annotation of submitted queries. The on-line image search and annotation system demonstrates the effectiveness of the deep features for assessing conceptual similarity among images, the effectiveness of the metadata cleaning algorithm, to identify a relevant training set for annotation, and the efficiency and accuracy of the MI-File similarity index techniques, to search and annotate using a dataset of 100M images, with very limited computing resources.
@inproceedings{Amato:2017:SAI:3095713.3095740,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto},
 title = {Searching and Annotating 100M Images with YFCC100M-HNfc6 and MI-File},
 booktitle = {Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing},
 series = {CBMI '17},
 year = {2017},
 isbn = {978-1-4503-5333-5},
 location = {Florence, Italy},
 pages = {26:1--26:4},
 articleno = {26},
 numpages = {4},
 url = {http://doi.acm.org/10.1145/3095713.3095740},
 doi = {10.1145/3095713.3095740},
 acmid = {3095740},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Deep Learning, Image Annotation, Image Search},
}

Preface

G. Amato, R. Connor, F. Falchi, C. Gennaro
In Information Systems (IS), Volume 64, March 2017, Page 151, ISSN: 0306-4379, Elsevier Sci LTD, Oxford, UK. WOS: 000391900000011, Scopus: 2-s2.0-85005801899, DOI: 10.1016/j.is.2016.10.008

@article{AMATO2017151,
 title = "Preface",
 journal = "Information Systems",
 volume = "64",
 pages = "151",
 year = "2017",
 issn = "0306-4379",
 doi = "http://dx.doi.org/10.1016/j.is.2016.10.008",
 author = "Giuseppe Amato and Richard Connor and Fabrizio Falchi and Claudio Gennaro",
}

How Effective Are Aggregation Methods on Binary Features?

G. Amato, F. Falchi, L. Vadicamo
Proceedings of VISAPP 2016 - 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Roma, Italy, February 27-29, 2016, Vol. 4, SciTePress, 2016, Pages 566-573, ISBN: 978-989-758-175-5, DOI: 10.5220/0005719905660573

@inproceedings{visapp16-bf,
 author={Giuseppe Amato and Fabrizio Falchi and Lucia Vadicamo},
 title={How Effective Are Aggregation Methods on Binary Features?},
 booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016)},
 year={2016},
 pages={566-573},
 doi={10.5220/0005719905660573},
 isbn={978-989-758-175-5},
}

Using Apache Lucene to Search Vector of Locally Aggregated Descriptors

G. Amato, P. Bolettieri, F. Falchi, C. Gennaro, L. Vadicamo
Proceedings of VISAPP 2016 - 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Roma, Italy, February 27-29, 2016, Vol. 4, SciTePress, 2016, Pages 383-392, ISBN: 978-989-758-175-5, DOI: 10.5220/0005722503830392

Surrogate Text Representation (STR) is a profitable solution to efficient similarity search on metric spaces using conventional text search engines, such as Apache Lucene. This technique is based on comparing the permutations of some reference objects in place of the original metric distance. However, the Achilles heel of the STR approach is the need to reorder the result set of the search according to the metric distance. This forces the use of a support database to store the original objects, which requires efficient random I/O on a fast secondary memory (such as flash-based storage). In this paper, we propose to extend the Surrogate Text Representation to specifically address a class of visual metric objects known as Vector of Locally Aggregated Descriptors (VLAD). This approach is based on representing the individual sub-vectors forming the VLAD vector with the STR, providing a finer representation of the vector and enabling us to get rid of the reordering phase. The experiments on a publicly available dataset show that the extended STR outperforms the baseline STR, achieving satisfactory performance close to the one obtained with the original VLAD vectors.

@inproceedings{visapp16-lucene,
author={Giuseppe Amato and Paolo Bolettieri and Fabrizio Falchi and Claudio Gennaro and Lucia Vadicamo},
title={Using Apache Lucene to Search Vector of Locally Aggregated Descriptors},
booktitle={Proceedings of the 11th Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2016)},
year={2016},
pages={383-392},
doi={10.5220/0005722503830392},
isbn={978-989-758-175-5},
}
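
As a rough illustration of the Surrogate Text Representation discussed in the abstract above, the sketch below turns the ranking of an object's closest pivots into a fake text document, repeating each pivot term more often the closer its pivot ranks, so that a conventional term-frequency text engine such as Lucene approximates permutation similarity. The pivot count and the repetition scheme are illustrative assumptions.

import numpy as np

def surrogate_text(obj, pivots, k=4):
    dists = np.linalg.norm(pivots - obj, axis=1)
    ranking = np.argsort(dists)[:k]            # ids of the k closest pivots
    terms = []
    for rank, pid in enumerate(ranking):
        terms += [f"p{pid}"] * (k - rank)      # closer pivots repeat more
    return " ".join(terms)                     # index this string as text

rng = np.random.default_rng(0)
pivots = rng.random((100, 64))
doc = surrogate_text(rng.random(64), pivots)
# e.g. "p31 p31 p31 p31 p66 p66 p66 p2 p2 p95": ready for a text engine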

Combining Fisher Vector and Convolutional Neural Networks for Image Retrieval

G. Amato, F. Falchi, F. Rabitti, L. Vadicamo
7th Italian Information Retrieval Workshop, IIR 2016; Venezia; Italy; 30 May 2016 through 31 May 2016; Code 123011
CEUR Workshop Proceedings, Volume 1653, 2016, Scopus: 2-s2.0-84985906026, ISSN: 1613-0073

Fisher Vector (FV) and deep Convolutional Neural Network (CNN) are two popular approaches for extracting effective image representations. FV aggregates local information (e.g., SIFT) and had been state-of-the-art before the recent success of deep learning approaches. Recently, the combination of FV and CNN has been investigated. However, only the aggregation of SIFT has been tested. In this work, we propose combining CNN and an FV built upon binary local features, called BMM-FV. The results show that combining BMM-FV and CNN improves the latter's retrieval performance, with less computational effort with respect to the use of the traditional FV, which relies on non-binary features.
@inproceedings{DBLP:conf/iir/AmatoFRV16,
 author = {Giuseppe Amato and
 Fabrizio Falchi and
 Fausto Rabitti and
 Lucia Vadicamo},
 title = {Combining Fisher Vector and Convolutional Neural Networks for Image
 Retrieval},
 booktitle = {Proceedings of the 7th Italian Information Retrieval Workshop, Venezia,
 Italy, May 30-31, 2016.},
 year = {2016},
 series = {{CEUR} Workshop Proceedings},
 volume = {1653},
 publisher = {CEUR-WS.org}
}
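
As a rough illustration of one simple way to combine the two representations discussed in the abstract above, the sketch below L2-normalizes the FV and CNN descriptors and concatenates them with a mixing weight before standard similarity search. The fusion scheme and the weight are illustrative assumptions, not the exact combination evaluated in the paper.

import numpy as np

def combine(fv, cnn, alpha=0.5):
    fv = fv / np.linalg.norm(fv)                 # L2-normalize each descriptor
    cnn = cnn / np.linalg.norm(cnn)
    return np.concatenate([alpha * fv, (1 - alpha) * cnn])

rng = np.random.default_rng(0)
joint = combine(rng.random(4096), rng.random(2048))  # searchable joint vector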

Indexing 100M Images with Deep Features and MI-File

G. Amato, F. Falchi, C. Gennaro, F. Rabitti
7th Italian Information Retrieval Workshop, IIR 2016; Venezia; Italy; 30 May 2016 through 31 May 2016; Code 123011
CEUR Workshop Proceedings, Volume 1653, 2016, Scopus: 2-s2.0-84985993649

@inproceedings{DBLP:conf/iir/AmatoFGR16,
 author = {Giuseppe Amato and
 Fabrizio Falchi and
 Claudio Gennaro and
 Fausto Rabitti},
 title = {Indexing 100M Images with Deep Features and MI-File},
 booktitle = {Proceedings of the 7th Italian Information Retrieval Workshop, Venezia,
 Italy, May 30-31, 2016.},
 year = {2016},
 series = {{CEUR} Workshop Proceedings},
 volume = {1653},
 publisher = {CEUR-WS.org}
}

Semiautomatic Learning of 3D Objects from Video Streams

F. Carrara, F. Falchi, C. Gennaro
Similarity Search and Applications, 8th International Conference (SISAP 2015), Glasgow, UK, October 12–14, 2015 Proceedings
Lecture Notes in Computer Science, vol. 9371, pages 217-228 - Springer International Publishing AG Switzerland
ISSN: 0302-9743, Print ISBN: 978-3-319-25086-1, Online ISBN: 978-3-319-25087-8, WOS: 000374289600020, Scopus: 2-s2.0-84951862185, DOI: 10.1007/978-3-319-25087-8_20

@Inbook{Carrara2015,
author="Carrara, Fabio
and Falchi, Fabrizio and Gennaro, Claudio",
editor="Amato, Giuseppe and Connor, Richard and Falchi, Fabrizio
and Gennaro, Claudio",
title="Semiautomatic Learning of 3D Objects from Video Streams",
bookTitle="Similarity Search and Applications: 8th International Conference, SISAP 2015, Glasgow, UK, October 12-14, 2015, Proceedings",
year="2015",
publisher="Springer International Publishing",
address="Cham",
pages="217--228",
isbn="978-3-319-25087-8",
doi="10.1007/978-3-319-25087-8_20",
url="https://doi.org/10.1007/978-3-319-25087-8_20"
}

Searching the EAGLE Epigraphic Material Through Image Recognition via a Mobile Device

P. Bolettieri, V. Casarosa, F. Falchi, L. Vadicamo, P. Martineau, S. Orlandi, R. Santucci
Similarity Search and Applications, 8th International Conference (SISAP 2015), Glasgow, UK, October 12–14, 2015 Proceedings
Lecture Notes in Computer Science, vol. 9371, pages 351-354 - Springer International Publishing AG Switzerland
ISSN: 0302-9743, Print ISBN: 978-3-319-25086-1, WOS: 000374289600035, Scopus: 2-s2.0-84951729513, DOI: 10.1007/978-3-319-25087-8_35

@Inbook{Bolettieri2015,
author="Bolettieri, Paolo
and Casarosa, Vittore
and Falchi, Fabrizio
and Vadicamo, Lucia
and Martineau, Philippe
and Orlandi, Silvia
and Santucci, Raffaella",
editor="Amato, Giuseppe
and Connor, Richard
and Falchi, Fabrizio
and Gennaro, Claudio",
title="Searching the EAGLE Epigraphic Material Through Image Recognition via a Mobile Device",
bookTitle="Similarity Search and Applications: 8th International Conference, SISAP 2015, Glasgow, UK, October 12-14, 2015, Proceedings",
year="2015",
publisher="Springer International Publishing",
address="Cham",
pages="351--354",
isbn="978-3-319-25087-8",
doi="10.1007/978-3-319-25087-8_35",
url="https://doi.org/10.1007/978-3-319-25087-8_35"
}

Preface

G. Amato, R. Connor, F. Falchi, C. Gennaro
Similarity Search and Applications, 8th International Conference (SISAP 2015), Glasgow, UK, October 12–14, 2015 Proceedings
Lecture Notes in Computer Science, vol. 9371, pages V-VI - Springer International Publishing AG Switzerland
ISSN: 0302-9743, Print ISBN: 978-3-319-25086-1, Online ISBN: 978-3-319-25087-8, Scopus: 2-s2.0-84951790748,

@ARTICLE{2015-SISAP-Preface,
author={Amato, G. and Connor, R. and Falchi, F. and Gennaro, C.},
title={Preface},
journal={Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)},
year={2015},
volume={9371},
pages={V-VI},
note={cited By 0},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-84951790748&partnerID=40&md5=fe36f8613207dcabe6cbdf952e57e007},
document_type={Editorial},
source={Scopus},
}

Efficient foreground-background segmentation using local features for object detection

F. Carrara, G. Amato, F. Falchi, C. Gennaro
9th International Conference on Distributed Smart Camera (ICDSC 2015), September 08 - 11, 2015, Seville, Spain - ACM New York, USA: pp. 175-180
ISBN: 978-1-4503-3681-9, Scopus: 2-s2.0-84958251956, DOI: 10.1145/2789116.2789136

@inproceedings{Carrara:2015:EFS:2789116.2789136,
 author = {Carrara, Fabio and Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio},
 title = {Efficient Foreground-background Segmentation Using Local Features for Object Detection},
 booktitle = {Proceedings of the 9th International Conference on Distributed Smart Cameras},
 series = {ICDSC '15},
 year = {2015},
 isbn = {978-1-4503-3681-9},
 location = {Seville, Spain},
 pages = {175--180},
 numpages = {6},
 url = {http://doi.acm.org/10.1145/2789116.2789136},
 doi = {10.1145/2789116.2789136},
 acmid = {2789136},
 publisher = {ACM},
 address = {New York, NY, USA},
} 

Visual Recognition in the EAGLE Project

G. Amato, P. Bolettieri, F. Falchi, F. Rabitti, L. Vadicamo
6th Italian Information Retrieval Workshop (IIR 2015) Cagliari, Italy, May 25-26, 2015
CEUR Workshop Proceedings, Volume 1404, 2015, Scopus: 2-s2.0-84938526788, ISSN: 1613-0073


@inproceedings{DBLP:conf/iir/AmatoBFRV15,
 author = {Giuseppe Amato and
 Paolo Bolettieri and
 Fabrizio Falchi and
 Fausto Rabitti and
 Lucia Vadicamo},
 title = {Visual Recognition in the {EAGLE} Project},
 booktitle = {Proceedings of the 6th Italian Information Retrieval Workshop, Cagliari,
 Italy, May 25-26, 2015},
 year = {2015},
 series = {{CEUR} Workshop Proceedings},
 volume = {1404},
 publisher = {CEUR-WS.org}
}


Bifocal search: embedding context in local descriptors

Fabrizio Falchi
ISTI Research Report n. 443866

@techreport{instance1290,
 title = {{Bifocal search: embedding context in local descriptors}},
 author = {Fabrizio Falchi},
 group = {ISTI},
 year = {2015},
 institution = {Consiglio Nazionale delle Ricerche}
}

Some Theoretical and Experimental Observations on Permutation Spaces and Similarity Search

G. Amato, F. Falchi, F. Rabitti, L. Vadicamo
Similarity Search and Applications, 7th International Conference, SISAP 2014, Los Cabos, Mexico, October 29-31, 2014. Proceedings Lecture Notes in Computer Science, vol. 8821, Springer International Publishing, 2014, pp. 37-49 - ISSN: 0302-9743, ISBN: 978-3-319-11987-8
WOS: 000345117600004, Scopus: 2-s2.0-84911192218, DOI: 10.1007/978-3-319-11988-5_4

@Inbook{2014-SISAP,
 author="Amato, Giuseppe and Falchi, Fabrizio and Rabitti, Fausto and Vadicamo, Lucia",
 editor="Traina, Agma Juci Machado and Traina, Caetano and Cordeiro, Robson Leonardo Ferreira",
 title="Some Theoretical and Experimental Observations on Permutation Spaces and Similarity Search",
 bookTitle="Similarity Search and Applications: 7th International Conference, SISAP 2014, Los Cabos, Mexico, October 29-31, 2014. Proceedings",
 year="2014",
 publisher="Springer International Publishing",
 address="Cham",
 pages="37--49",
 isbn="978-3-319-11988-5",
 doi="10.1007/978-3-319-11988-5_4",
 url="https://doi.org/10.1007/978-3-319-11988-5_4"
}

Indexing Vectors of Locally Aggregated Descriptors Using Inverted Files

G. Amato, F. Falchi, C. Gennaro, P. Bolettieri
International Conference on Multimedia Retrieval, ICMR '14, Glasgow, United Kingdom - April 01 - 04, 2014. ACM New York, NY, USA 2014: pages 439-442
ISBN: 978-1-4503-2782-4, Scopus: 2-s2.0-84899764244, DOI: 10.1145/2578726.2578788

@inproceedings{Amato:2014:IVL:2578726.2578788,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio and Bolettieri, Paolo},
 title = {Indexing Vectors of Locally Aggregated Descriptors Using Inverted Files},
 booktitle = {Proceedings of International Conference on Multimedia Retrieval},
 series = {ICMR '14},
 year = {2014},
 isbn = {978-1-4503-2782-4},
 location = {Glasgow, United Kingdom},
 pages = {439--442},
 numpages = {4},
 url = {http://doi.acm.org/10.1145/2578726.2578788},
 doi = {10.1145/2578726.2578788},
 acmid = {2578788},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {image classification, landmarks recognition, local features},
} 

Aggregating Local Descriptors for Epigraphs Recognition

G. Amato, Fabrizio Falchi, F. Rabitti, Lucia Vadicamo
In Digital Presentation and Preservation of Cultural and Scientific Heritage, Fourth International Conference Digital Presentation and Preservation of Cultural and Scientific Heritage, DiPP2014 (September 18–21, 2014, Veliko Tarnovo, Bulgaria), Institute of Mathematics and Informatics Bulgarian Academy of Sciences, vol. 4, No 1, (2014), pages 49-58, ISSN: 1314-4006

@inproceedings{2014-DiPP,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Rabitti, Fausto and Vadicamo, Lucia},
 title = {Aggregating Local Descriptors for Epigraphs Recognition},
 booktitle = {Digital Presentation and Preservation of Cultural and Scientific Heritage},
 series = {DiPP 2014},
 year = {2014},
 issn = {1314-4006},
 location = {Veliko Tarnovo, Bulgaria},
 pages = {49-58},
 url = {http://sci-gems.math.bas.bg/jspui/handle/10525/2411},
 publisher = {Institute of Mathematics and Informatics Bulgarian Academy of Sciences},
}

Pivot selection strategies for permutation-based similarity search

G. Amato, A. Esuli, F. Falchi
Similarity Search and Applications, 6th International Conference (SISAP 2013)
Lecture Notes in Computer Science, vol. 8199, pages 91-102 - Springer International Publishing AG Switzerland. ISSN 0302-9743, ISBN 978-3-642-41061-1
WOS: 000338111900010, Scopus: 2-s2.0-84886413518, DOI: 10.1007/978-3-642-41062-8_10

@Inbook{2013SisapPivot,
 author="Amato, Giuseppe and Esuli, Andrea and Falchi, Fabrizio",
 editor="Brisaboa, Nieves and Pedreira, Oscar and Zezula, Pavel",
 title="Pivot Selection Strategies for Permutation-Based Similarity Search",
 bookTitle="Similarity Search and Applications: 6th International Conference, SISAP 2013, A Coru{\~{n}}a, Spain, October 2-4, 2013, Proceedings",
 year="2013",
 publisher="Springer Berlin Heidelberg",
 address="Berlin, Heidelberg",
 pages="91--102",
 isbn="978-3-642-41062-8",
 doi="10.1007/978-3-642-41062-8_10",
 url="https://doi.org/10.1007/978-3-642-41062-8_10"
}

Large Scale Image Retrieval Using Vector of Locally Aggregated Descriptors

G. Amato, P. Bolettieri, F. Falchi, C. Gennaro
Similarity Search and Applications, 6th International Conference (SISAP 2013)
Lecture Notes in Computer Science, vol. 8199, pages 245-256 - Springer International Publishing AG Switzerland. ISSN 0302-9743, ISBN 978-3-642-41061-1
WOS: 000338111900025, Scopus: 2-s2.0-84981249591, DOI: 10.1007/978-3-642-41062-8_25

@Inbook{2013SisapVLAD,
 author="Amato, Giuseppe and Bolettieri, Paolo and Falchi, Fabrizio and Gennaro, Claudio",
 editor="Brisaboa, Nieves and Pedreira, Oscar and Zezula, Pavel",
 title="Large Scale Image Retrieval Using Vector of Locally Aggregated Descriptors",
 bookTitle="Similarity Search and Applications: 6th International Conference, SISAP 2013, A Coru{\~{n}}a, Spain, October 2-4, 2013, Proceedings",
 year="2013",
 publisher="Springer Berlin Heidelberg",
 address="Berlin, Heidelberg",
 pages="245--256",
 isbn="978-3-642-41062-8",
 doi="10.1007/978-3-642-41062-8_25",
 url="https://doi.org/10.1007/978-3-642-41062-8_25"
}

On Reducing the Number of Visual Words in the Bag-of-Features Representation

G. Amato, F. Falchi, C. Gennaro
VISAPP 2013 - Proceedings of the International Conference on Computer Vision Theory and Applications, Volume 1, Barcelona, Spain, 21-24 February, 2013, SCITEPRESS – Science and Technology Publications, Portugal, 2013, pages 657-662, ISBN: 978-989-8565-47-1
Scopus: 2-s2.0-84878259406, DOI: 10.5220/0004290506570662

@inproceedings{DBLP:conf/visapp/AmatoFG13,
 author = {Giuseppe Amato and
 Fabrizio Falchi and
 Claudio Gennaro},
 title = {On Reducing the Number of Visual Words in the Bag-of-Features Representation},
 booktitle = {{VISAPP} 2013 - Proceedings of the International Conference on Computer
 Vision Theory and Applications, Volume 1, Barcelona, Spain, 21-24
 February, 2013.},
 pages = {657--662},
 year = {2013},
 editor = {Sebastiano Battiato and
 Jos{\'{e}} Braz},
 publisher = {SciTePress},
 doi		= {10.5220/0004290506570662},
 isbn = {978-989-8565-47-1},
}

Using Visual Attention in a CBIR System - Experimental Results on Landmark and Object Recognition Tasks

F.A. Cardillo, G. Amato, F. Falchi
VISAPP 2013 - Proceedings of the International Conference on Computer Vision Theory and Applications, Volume 1, Barcelona, Spain, 21-24 February, 2013, SCITEPRESS – Science and Technology Publications, Portugal, 2013, pages 468-471, ISBN: 978-989-8565-47-1,
Scopus: 2-s2.0-84878226796, DOI: 10.5220/0004299404680471


@inproceedings{DBLP:conf/visapp/CardilloAF13,
 author = {Franco Alberto Cardillo and
 Giuseppe Amato and
 Fabrizio Falchi},
 title = {Using Visual Attention in a {CBIR} System - Experimental Results on
 Landmark and Object Recognition Tasks},
 booktitle = {{VISAPP} 2013 - Proceedings of the International Conference on Computer
 Vision Theory and Applications, Volume 1, Barcelona, Spain, 21-24
 February, 2013.},
 pages = {468--471},
 year = {2013},
 editor = {Sebastiano Battiato and
 Jos{\'{e}} Braz},
 publisher = {SciTePress},
 doi		= {10.5220/0004299404680471},
 isbn = {978-989-8565-47-1},
}

Evaluating inverted files for visual compact codes on a large scale

G. Amato, P. Bolettieri, F. Falchi, C. Gennaro
LSDS-IR: Large-Scale and Distributed Systems for Information Retrieval, 10th Workshop co-located with ACM WSDM 2013

@inproceedings{2013-LSDS-IR-Amato,
 author = {Giuseppe Amato and Paolo Bolettieri and Fabrizio Falchi and Claudio Gennaro},
 title = {Evaluating inverted files for visual compact codes on a large scale},
 booktitle = {{LSDS-IR} 2013 - 10th international workshop on large-scale and distributed systems for information retrieval (LSDS-IR), co-located with ACM WSDM, Roma, Italy, 5 February, 2013.},
 pages = {44-49},
 year = {2013},
 urn 		= {https://pdfs.semanticscholar.org/cb5e/44949138af606e1b39a9603a442e593d9653.pdf#page=44},
}

Automatic Aerial Image Alignment for GeoMemories

G. Amato, Fabrizio Falchi, F. Rabitti, Andrea Marchetti, Maurizio Tesconi
MMEDIA 2013, The Fifth International Conference on Advances in Multimedia. Venice, Italy, April 21-26, 2013. ISSN: 2308-4448
Scopus: 2-s2.0-84905852103,

In the last few years, aerial and satellite photographs have become more and more important as historical records. The availability of Geographical Information Systems and the increasing number of photos taken every year enable very advanced use of large collections of content. In this paper we illustrate the GeoMemories approach and focus on its automatic image alignment architecture. The approach leverages a set of georeferenced images used as a knowledge base. Local features are used in combination with compact codes and space transformations to achieve a high level of efficiency.
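A minimal sketch of the local-feature alignment step, in the spirit of the approach above (assuming OpenCV and NumPy; the file names and the 0.75 ratio-test threshold are illustrative, not taken from the paper):

import cv2
import numpy as np

# The historical aerial photo and a georeferenced image from the
# knowledge base (hypothetical file names).
query = cv2.imread("aerial.jpg", cv2.IMREAD_GRAYSCALE)
reference = cv2.imread("reference_georef.jpg", cv2.IMREAD_GRAYSCALE)

# Extract SIFT local features from both images.
sift = cv2.SIFT_create()
kp_q, desc_q = sift.detectAndCompute(query, None)
kp_r, desc_r = sift.detectAndCompute(reference, None)

# Match descriptors and keep only matches passing Lowe's ratio test.
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(desc_q, desc_r, k=2)
        if m.distance < 0.75 * n.distance]

# Estimate a homography with RANSAC and warp the photo onto the
# georeferenced image, which in effect georeferences the photo too.
src = np.float32([kp_q[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp_r[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
aligned = cv2.warpPerspective(query, H, (reference.shape[1], reference.shape[0]))

The compact codes and space transformations mentioned in the abstract, which make the search over the knowledge base efficient, are omitted here.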
@inproceedings{2013-MMEDIA-GeoMemories,
	 author = {Giuseppe Amato and Fabrizio Falchi and Fausto Rabitti and Andrea Marchetti and Maurizio Tesconi},
 title = {Automatic Aerial Image Alignment for GeoMemories},
 booktitle = {MMEDIA 2013, The Fifth International Conference on Advances in Multimedia},
 pages = {62-66},
 year = {2013},
 urn 		= {https://www.thinkmind.org/index.php?view=article&articleid=mmedia_2013_3_30_40065},
}

On kNN Classification and Local Feature Based Similarity Functions

G. Amato, F. Falchi
In Communications in Computer and Information Science, Volume 271, 2013, Revised Selected Papers of ICAART 2011, Springer-Verlag Berlin Heidelberg (New York, NY, USA), pages 224-239. ISSN 1865-0929, ISBN 978-3-642-29965-0, Scopus: 2-s2.0-84880472773, DOI: 10.1007/978-3-642-29966-7_15

@Inbook{Amato2013,
 author="Amato, Giuseppe and Falchi, Fabrizio",
 editor="Filipe, Joaquim and Fred, Ana",
 title="On kNN Classification and Local Feature Based Similarity Functions",
 bookTitle="Agents and Artificial Intelligence: Third International Conference, ICAART 2011, Rome, Italy, January, 28-30, 2011. Revised Selected papers",
 year="2013",
 publisher="Springer Berlin Heidelberg",
 address="Berlin, Heidelberg",
 pages="224--239",
 isbn="978-3-642-29966-7",
 doi="10.1007/978-3-642-29966-7_15",
 url="https://doi.org/10.1007/978-3-642-29966-7_15"
}

Visual Features Selection

G. Amato, F. Falchi, C. Gennaro
4th Italian Information Retrieval Workshop, IIR 2013. Pisa, Italy, January 16-17, 2013, CEUR Workshop Proceedings, Volume 964, 2013, pages 41-44, Scopus: 2-s2.0-84922765206, ISSN: 1613-0073

@inproceedings{2013-IIR-FVS,
 author = {Giuseppe Amato and Fabrizio Falchi and Claudio Gennaro},
 title = {Visual Features Selection},
 booktitle = {Proceedings of the 4th Italian Information Retrieval Workshop, Pisa,
 Italy, Jan 16-17, 2013.},
 year = {2013},
 series = {{CEUR} Workshop Proceedings},
 volume = {964},
 pages		= {41-44},
 publisher = {CEUR-WS.org},
}

Experimenting a Visual Attention Model in the Context of CBIR Systems

F.A. Cardillo, G. Amato, F. Falchi
4th Italian Information Retrieval Workshop, IIR 2013. Pisa, Italy, January 16-17, 2013, CEUR Workshop Proceedings, Volume 964, 2013, pages 45-56, Scopus: 2-s2.0-84922785036, ISSN: 1613-0073

@inproceedings{2013-IIR-Att,
 author = {Franco Alberto Cardillo and Giuseppe Amato and Fabrizio Falchi},
 title = {Experimenting a Visual Attention Model in the Context of CBIR Systems},
 booktitle = {Proceedings of the 4th Italian Information Retrieval Workshop, Pisa,
 Italy, Jan 16-17, 2013.},
 year = {2013},
 series = {{CEUR} Workshop Proceedings},
 volume = {964},
 pages		= {45-56},
 publisher = {CEUR-WS.org},
}

Autonomic preservation of access copies of digital contents

W. Allasia, F. Falchi, F. Gallo, C. Meghini
In Proceedings of Memory of the World in the Digital Age: Digitization and Preservation, 26-28 September 2012, Vancouver, BC, Canada, UNESCO, 2012, pages 976-987

@inproceedings{2012-UNESCO,
 author = {Walter Allasia and Fabrizio Falchi and Francesco Gallo and Carlo Meghini},
 title = {Autonomic preservation of access copies of digital contents},
 booktitle = {Proceedings of Memory of the World in the Digital Age: Digitization and Preservation, 26-28 September 2012, Vancouver, BC, Canada},
 year = {2012},
 pages = {976-987},
 publisher = {UNESCO},
}

Landmark Recognition in VISITO Tuscany

G. Amato, F. Falchi, F. Rabitti
In Multimedia for Cultural Heritage, First International Workshop, MM4CH 2011. Modena, Italy, May 3, 2011. Revised Selected papers. Communications in Computer and Information Science, Volume 247, Part 1, pages 1-13, Springer-Verlag Berlin Heidelberg (New York, NY, USA), 2012. ISSN: 1865-0929, ISBN: 978-3-642-27977-5.
WOS: 000309892800001, Scopus: 2-s2.0-84856473133, DOI: 10.1007/978-3-642-27978-2_1

@Inbook{Amato2012,
 author="Amato, Giuseppe and Falchi, Fabrizio and Rabitti, Fausto",
 editor="Grana, Costantino and Cucchiara, Rita",
 title="Landmark Recognition in VISITO Tuscany",
 bookTitle="Multimedia for Cultural Heritage: First International Workshop, MM4CH 2011, Modena, Italy, May 3, 2011, Revised Selected papers",
 year="2012",
 publisher="Springer Berlin Heidelberg",
 address="Berlin, Heidelberg",
 pages="1--13",
 isbn="978-3-642-27978-2",
 doi="10.1007/978-3-642-27978-2_1",
 url="https://doi.org/10.1007/978-3-642-27978-2_1"
}

Geometric consistency checks for kNN based image classification relying on local features

G. Amato, F. Falchi, C. Gennaro
In Proceedings of the Fourth International Conference on SImilarity Search and APplications (SISAP 2011), Lipari, Italy, 30 June – 1 July 2011, ACM, New York, NY, USA, 2011, pages 81-88.
ISBN: 978-1-4503-0795-6, Scopus: 2-s2.0-79960951691, DOI: 10.1145/1995412.1995428

@inproceedings{Amato:2011:GCC:1995412.1995428,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Gennaro, Claudio},
 title = {Geometric Consistency Checks for kNN Based Image Classification Relying on Local Features},
 booktitle = {Proceedings of the Fourth International Conference on SImilarity Search and APplications},
 series = {SISAP '11},
 year = {2011},
 isbn = {978-1-4503-0795-6},
 location = {Lipari, Italy},
 pages = {81--88},
 numpages = {8},
 url = {http://doi.acm.org/10.1145/1995412.1995428},
 doi = {10.1145/1995412.1995428},
 acmid = {1995428},
 publisher = {ACM},
 address = {New York, NY, USA},
} 

Combining local and global visual feature similarity using a text search engine

G. Amato, P. Bolettieri, F. Falchi, C. Gennaro, F. Rabitti
In Content-Based Multimedia Indexing (CBMI), 9th International Workshop on, Madrid, Spain, 13-15 June 2011. IEEE Computer Society (New York, NY, USA), 2011, pages 49-54.
ISBN: 978-1-61284-432-9, Scopus: 2-s2.0-80052287452, DOI: 10.1109/CBMI.2011.5972519

@INPROCEEDINGS{5972519,
author={G. Amato and P. Bolettieri and F. Falchi and C. Gennaro and F. Rabitti},
booktitle={2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI)},
title={Combining local and global visual feature similarity using a text search engine},
year={2011},
pages={49-54},
keywords={content-based retrieval;feature extraction;image retrieval;search engines;text analysis;Lucene retrieval engine;content based retrieval systems;global visual feature similarity;image content processing;local visual feature similarity;text search engine;Feature extraction;Image color analysis;Indexing;Transform coding;Visualization;Vocabulary;Access Methods;Approximate Similarity Search;Lucene},
doi={10.1109/CBMI.2011.5972519},
ISSN={1949-3983},
month={June},}

Landmark recognition in VISITO: VIsual Support to Interactive TOurism in Tuscany

G. Amato, P. Bolettieri, F. Falchi
Proceedings of the 1st ACM International Conference on Multimedia Retrieval, ICMR 2011, Trento, Italy, April 18-20, 2011 ACM New York, NY, USA, 2011: demo paper ID 661. ISBN: 978-1-4503-0336-1
Scopus: 2-s2.0-79959700609, DOI: 10.1145/1991996.1992057

@inproceedings{Amato:2011:LRV:1991996.1992057,
 author = {Amato, Giuseppe and Bolettieri, Paolo and Falchi, Fabrizio},
 title = {Landmark Recognition in VISITO: VIsual Support to Interactive TOurism in Tuscany},
 booktitle = {Proceedings of the 1st ACM International Conference on Multimedia Retrieval},
 series = {ICMR '11},
 year = {2011},
 isbn = {978-1-4503-0336-1},
 location = {Trento, Italy},
 pages = {61:1--61:2},
 articleno = {61},
 numpages = {2},
 url = {http://doi.acm.org/10.1145/1991996.1992057},
 doi = {10.1145/1991996.1992057},
 acmid = {1992057},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {classifiation, image classification, interactive tourism, landmarks recognition},
} 

Local Feature based Image Similarity Functions for kNN Classification

G. Amato, F. Falchi
ICAART 2011 - Proceedings of the 3rd International Conference on Agents and Artificial Intelligence, Volume 1 – Artificial Intelligence, Rome, Italy, January 28-30, 2011. SciTePress (Portugal), 2011, pp. 157-166. ISBN: 978-989-8425-40-9
Scopus: 2-s2.0-79960148321, DOI: 10.5220/0003185401570166

@InProceedings{icaart11,
 author={Giuseppe Amato and Fabrizio Falchi},
 title={Local Feature based Image Similarity Functions for kNN Classification},
 booktitle={Proceedings of the 3rd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART,},
 year={2011},
 pages={157-166},
 publisher={SciTePress},
 organization={INSTICC},
 doi={10.5220/0003185401570166},
 isbn={978-989-8425-40-9},
}

Indexing support vector machines for efficient top-k classification

G. Amato, P. Bolettieri, F. Falchi, F. Rabitti, P. Savino
In Proceedings of the 3rd International Conference on Advances in Multimedia (MMEDIA 2011), Budapest, Hungary, 17-22 April 2011. IARIA, 2011, pages 56-61. ISSN: 2308-4448, ISBN: 978-1-61208-129-8, WOS: 000397458200010, Scopus: 2-s2.0-84893309725

This paper proposes an approach to efficiently execute approximate top-k classification (that is, identifying the best k elements of a class) using Support Vector Machines on web-scale datasets, without significant loss of effectiveness. The novelty of the proposed approach, with respect to other approaches in the literature, is that it allows speeding up several classifiers, each one defined with different kernels and kernel parameters, by using one single index.
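As a reference point, here is a minimal sketch of the exact top-k classification that such an index approximates (assuming scikit-learn and NumPy; the data and kernel parameters are synthetic stand-ins, not taken from the paper):

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 16))
y_train = (X_train[:, 0] > 0).astype(int)
collection = rng.normal(size=(10_000, 16))   # the dataset to classify

# One of possibly many classifiers, each with its own kernel and parameters.
clf = SVC(kernel="rbf", gamma=0.1).fit(X_train, y_train)

# The decision function is the signed margin: the larger it is, the more
# confidently an element belongs to the class.
scores = clf.decision_function(collection)
k = 10
top_k = np.argsort(-scores)[:k]              # indices of the best k elements

The exhaustive scoring above is what becomes infeasible at web scale; the paper's contribution is a single index that approximates this ranking for several kernels and kernel parameters at once.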
@InProceedings{2011-MMEDIA,
author={Amato, G. and Bolettieri, P. and Falchi, F. and Rabitti, F. and Savino, P.},
title={Indexing support vector machines for efficient top-k classification},
booktitle={MMEDIA 2011 - The Third International Conference on Advances in Multimedia},
year={2011},
pages={56-61},
}

kNN based image classification relying on local feature similarity

G. Amato, F. Falchi
In Proceedings of the Third International Conference on SImilarity Search and APplications (SISAP 2010), Istanbul, Turkey, 18-19 September 2010. ACM, New York, NY, USA, 2010: pages 101-108. ISBN: 978-1-4503-0420-7
Scopus: 2-s2.0-78649874974, DOI: 10.1145/1862344.1862360

@inproceedings{Amato:2010:KBI:1862344.1862360,
 author = {Amato, Giuseppe and Falchi, Fabrizio},
 title = {kNN Based Image Classification Relying on Local Feature Similarity},
 booktitle = {Proceedings of the Third International Conference on SImilarity Search and APplications},
 series = {SISAP '10},
 year = {2010},
 isbn = {978-1-4503-0420-7},
 location = {Istanbul, Turkey},
 pages = {101--108},
 numpages = {8},
 url = {http://doi.acm.org/10.1145/1862344.1862360},
 doi = {10.1145/1862344.1862360},
 acmid = {1862360},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {image classification, image indexing, landmarks, local features, recognition},
} 

Recognizing Landmarks Using Automated Classification Techniques: Evaluation of Various Visual Features

G. Amato, F. Falchi, P. Bolettieri
In Proceedings of the Second International Conferences on Advances in Multimedia (MMEDIA 2010), Athens/Glyfada, Greece, 13-19 June 2010. IEEE Computer Society (New York, NY, USA), 2010: pp. 78-83.
ISBN: 978-0-7695-4068-9, Scopus: 2-s2.0-77955261612, DOI: 10.1109/MMEDIA.2010.20

@inproceedings{Amato:2010:RLU:1848647.1848930,
 author = {Amato, Giuseppe and Falchi, Fabrizio and Bolettieri, Paolo},
 title = {Recognizing Landmarks Using Automated Classification Techniques: Evaluation of Various Visual Features},
 booktitle = {Proceedings of the 2010 Second International Conferences on Advances in Multimedia},
 series = {MMEDIA '10},
 year = {2010},
 isbn = {978-0-7695-4068-9},
 pages = {78--83},
 numpages = {6},
 url = {http://dx.doi.org/10.1109/MMEDIA.2010.20},
 doi = {10.1109/MMEDIA.2010.20},
 acmid = {1848930},
 publisher = {IEEE Computer Society},
 address = {Washington, DC, USA},
 keywords = {Image indexing, image classification, recognition, landmarks},
} 

Image classification via adaptive ensembles of descriptor-specific classifiers

T. Fagni, F. Falchi, F. Sebastiani
In Pattern Recognition and Image Analysis: Advances in Mathematical Theory and Applications (PRIA), vol. 20, n. 1 (2010) MAIK Nauka/Interperiodica distributed exclusively by Springer Science+Business Media LLC. pp. 21-28. ISSN: 1054-6618, eISSN: 1555-6212
WOS: RSCI:15270545, Scopus: 2-s2.0-77952198723, DOI: 10.1134/S1054661810010025

@Article{Fagni2010,
 author="Fagni, T. and Falchi, F. and Sebastiani, F.",
 title="Image classification via adaptive ensembles of descriptor-specific classifiers",
 journal="Pattern Recognition and Image Analysis",
 year="2010",
 month="Mar",
 day="01",
 volume="20",
 number="1",
 pages="21--28",
 issn="1555-6212",
 doi="10.1134/S1054661810010025",
 url="https://doi.org/10.1134/S1054661810010025"
}

Searching 100M Images by Content Similarity

P. Bolettieri, F. Falchi, C. Lucchese, Y. Mass, R. Perego, F. Rabitti, M. Shmueli-Scheuer
In Post-proceedings of the 5th Italian Research Conference on Digital Library Systems - IRCDL 2009, Padova, Italy, January 29-30, 2009. Revised selected papers. DELOS: an Association for Digital Libraries 2009: pp. 88-99. ISBN: 978-88-903541-7-5

@inproceedings{DBLP:conf/ircdl/BolettieriFLMPRS09,
 author = {Paolo Bolettieri and
 Fabrizio Falchi and
 Claudio Lucchese and
 Yosi Mass and
 Raffaele Perego and
 Fausto Rabitti and
 Michal Shmueli{-}Scheuer},
 title = {Searching 100M Images by Content Similarity},
 booktitle = {Post-proceedings of the Fifth Italian Research Conference on Digital
 Libraries - {IRCDL} 2009, Padova, Italy, 29-30 January 2009},
 pages = {88--99},
 year = {2009},
 publisher = {{DELOS:} an Association for Digital Libraries / Department of Information
 Engineering of the University of Padua},
}

Caching content-based queries for robust and efficient image retrieval

F. Falchi, C. Lucchese, S. Orlando, R. Perego, F. Rabitti
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT2009), March 23-26 2009, Saint-Petersburg, Russia. (Extending Database Technology; Vol. 360) ACM, New York, NY, USA, 2009, full paper: pp. 780-790. ISBN: 978-1-60558-422-5
Scopus: 2-s2.0-70349122915, DOI: 10.1145/1516360.1516450

@inproceedings{Falchi:2009:CCQ:1516360.1516450,
 author = {Falchi, Fabrizio and Lucchese, Claudio and Orlando, Salvatore and Perego, Raffaele and Rabitti, Fausto},
 title = {Caching Content-based Queries for Robust and Efficient Image Retrieval},
 booktitle = {Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology},
 series = {EDBT '09},
 year = {2009},
 isbn = {978-1-60558-422-5},
 location = {Saint Petersburg, Russia},
 pages = {780--790},
 numpages = {11},
 url = {http://doi.acm.org/10.1145/1516360.1516450},
 doi = {10.1145/1516360.1516450},
 acmid = {1516450},
 publisher = {ACM},
 address = {New York, NY, USA},
}

Adaptive committees of feature-specific classifiers for image classification

T. Fagni, F. Falchi, F. Sebastiani
In Image Mining. Theory and Applications. Proceedings of the 2nd International Workshop on Image Mining Theory and Applications. IMTA-09. In conjunction with VISGRAPP 2009, Lisboa – Portugal, February 2009, INSTICC Press (Portugal), 2009, full paper, pp. 113-122. ISBN: 978-989-8111-42-5
WOS: 000267753900013, Scopus: 2-s2.0-67650548640, DOI: 10.5220/0001968501130122

@InProceedings{imta09,
author={Tiziano Fagni and Fabrizio Falchi and Fabrizio Sebastiani},
title={Adaptive Committees of Feature-specific Classifiers for Image Classification},
booktitle={Proceedings of the 2nd International Workshop on Image Mining Theory and Applications - Volume 1: Workshop IMTA, (VISIGRAPP 2009)},
year={2009},
pages={113-122},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001968501130122},
isbn={978-989-8111-80-7},
}

Caching Algorithms for Similarity Search

F. Falchi, C. Lucchese, S. Orlando, R. Perego, F. Rabitti
SEBD 2009, Proceedings of the 17th Italian Symposium on Advanced Database Systems, Camogli (Genova, Italy), June 21-24, 2009, Seneca Edizioni, 2009, extended abstract: pp. 145-152. ISBN: 978-88-6122-154-3, Scopus: 2-s2.0-84893423219

Similarity search in metric spaces is a general paradigm that can be used in several application fields, one of which is content-based image retrieval. In order to become an effective complement to traditional Web-scale text-based image retrieval solutions, content-based image retrieval must be efficient and scalable. In this paper we investigate caching the answers to content-based image retrieval queries in metric space, with the aim of reducing the average cost of query processing and boosting the overall system throughput. Our proposal allows the cache to return approximate answers with an acceptable quality guarantee even if the processed query has never been encountered in the past. By conducting tests on a collection of one million high-quality digital photos, we show that the proposed caching techniques can have a significant impact on performance. Moreover, we show that our caching algorithm does not suffer from cache pollution caused by near-duplicate query objects.
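A minimal sketch of the metric-cache idea (assuming a Euclidean metric and a fixed threshold tau, both illustrative; the paper's eviction policies and quality guarantees are omitted):

import numpy as np

class MetricCache:
    def __init__(self, tau):
        self.tau = tau
        self.entries = []            # list of (query vector, k-NN result list)

    def lookup(self, q):
        # If a past query lies within tau of q, reuse its result list as an
        # approximate answer; by the triangle inequality, each cached distance
        # differs from the true distance to q by at most d(q, q') <= tau.
        for q_cached, results in self.entries:
            if np.linalg.norm(q - q_cached) <= self.tau:
                return results       # approximate hit
        return None                  # miss

    def store(self, q, results):
        self.entries.append((q, results))

On a miss, the query would be answered by the underlying similarity-search index and the result inserted with store().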
@inproceedings{DBLP:conf/sebd/FalchiLOPR09,
 author = {Fabrizio Falchi and
 Claudio Lucchese and
 Salvatore Orlando and
 Raffaele Perego and
 Fausto Rabitti},
 title = {Caching Algorithms for Similarity Search},
 booktitle = {Proceedings of the Seventeenth Italian Symposium on Advanced Database
 Systems, {SEBD} 2009, Camogli, Italy, June 21-24, 2009},
 pages = {145--152},
 publisher = {Edizioni Seneca},
 year = {2009},
 isbn = {978-88-6122-154-3},
 timestamp = {Thu, 11 Mar 2010 12:55:33 +0100},
 biburl = {http://dblp2.uni-trier.de/rec/bib/conf/sebd/2009},
 bibsource = {dblp computer science bibliography, http://dblp.org}
}

A metric cache for similarity search

F. Falchi, C. Lucchese, S. Orlando, R. Perego, F. Rabitti
In International Conference on Information and Knowledge Management. Proceeding of the 2008 ACM Workshop on Large-Scale distributed systems for information retrieval (LSDS-IR'08), Napa Valley, California, USA, October 30, 2008. ACM (New York, NY, USA), 2008: full paper, pp. 43-50.

ISBN: 978-1-60558-254-2, Scopus: 2-s2.0-84893423219, DOI: 10.1145/1458469.1458473

@inproceedings{Falchi:2008:MCS:1458469.1458473,
 author = {Falchi, Fabrizio and Lucchese, Claudio and Orlando, Salvatore and Perego, Raffaele and Rabitti, Fausto},
 title = {A Metric Cache for Similarity Search},
 booktitle = {Proceedings of the 2008 ACM Workshop on Large-Scale Distributed Systems for Information Retrieval},
 series = {LSDS-IR '08},
 year = {2008},
 isbn = {978-1-60558-254-2},
 location = {Napa Valley, California, USA},
 pages = {43--50},
 numpages = {8},
 url = {http://doi.acm.org/10.1145/1458469.1458473},
 doi = {10.1145/1458469.1458473},
 acmid = {1458473},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {content-based retrieval, metric spaces, query-result caching},
}

Audio-visual content analysis in P2P networks: the SAPIR approach

W. Allasia, F. Falchi, F. Gallo, M. Kacimi, A. Kaplan, J. Mamou, Y. Mass, N. Orio
In Proceedings of the 19th International Conference on Database and Expert Systems Applications (DEXA 2008), First International Workshop on Automated Information Extraction in Media Production - AIEMPro'08, IEEE Computer Society (New York, NY, USA), Turin, Italy, September 1-5, 2008: full paper, pp. 610-614. ISBN: 0-7695-3030-3, ISSN: 1529-4188. WOS: 000259487400103, Scopus: 2-s2.0-57849156290, DOI: 10.1109/DEXA.2008.123

@INPROCEEDINGS{4624785,
 author={W. Allasia and F. Falchi and F. Gallo and M. Kacimi and A. Kaplan and J. Mamou and Y. Mass and N. Orio},
 booktitle={2008 19th International Workshop on Database and Expert Systems Applications},
 title={Audio-Visual Content Analysis in P2P Networks: The SAPIR Approach},
 year={2008},
 pages={610-614},
 doi={10.1109/DEXA.2008.123},
 ISSN={1529-4188},
 month={Sept},}

Using MPEG-7 for Automatic Annotation of Audiovisual Content in eLearning Digital Libraries

G. Amato, P. Bolettieri, F. Debole, F. Falchi, C. Gennaro, F. Rabitti
In Post-proceedings of the Fourth Italian Research Conference on Digital Library Systems, IRCDL 2008, Padova, Italy, January 24-25, 2008 DELOS: an Association for Digital Libraries 2008: full paper, pp. 1-12.

@inproceedings{DBLP:conf/ircdl/AmatoBDFGR08,
 author = {Giuseppe Amato and Paolo Bolettieri and Franca Debole and Fabrizio Falchi and Claudio Gennaro and Fausto Rabitti},
 title = {Using {MPEG-7} for Automatic Annotation of Audiovisual Content in eLearning Digital Libraries},
 booktitle = {Post-proceedings of the Forth Italian Research Conference on Digital Library Systems, {IRCDL} 2008, Padova, Italy, 24-25 January 2008},
 pages = {1--12},
 editor = {Maristella Agosti and Floriana Esposito and Costantino Thanos},
 publisher = {{DELOS:} an Association for Digital Libraries},
 year = {2008}
}

Crawling, indexing, and similarity searching images on the web

M. Batko, F. Falchi, C. Lucchese, D. Novak, R. Perego, F. Rabitti, J. Sedmidubsky, P. Zezula
In SEBD 2008, Proceedings of the 16th Italian Symposium on Advanced Database Systems, Mondello, Italy, June 22-25, 2008. Fotograf (Palermo, Italy), 2008: extended abstract, pp. 382-389. Scopus: 2-s2.0-84864285637

@INPROCEEDINGS{Batko08crawling,
 author = {M. Batko and F. Falchi and C. Lucchese and D. Novak and R. Perego and F. Rabitti and J. Sedmidubsky and P. Zezula},
 title = {Crawling, indexing, and similarity searching images on the web},
 booktitle = {Proceedings of SEBD '08, the 16th Italian Symposium on Advanced Database Systems},
 year = {2008},
 pages = {382--389}
}

Efficient video-stream filtering

F. Falchi, C. Gennaro, P. Savino, P. Stanchev
In IEEE MultiMedia, Volume 15, No. 1 (January 2008). IEEE Computer Society (New York, NY, USA), 2008: pp. 52-62. ISSN: 1070-986X, WOS: 000253852700007, Scopus: 2-s2.0-42349091931, DOI: 10.1109/MMUL.2008.6

@ARTICLE{4476273,
author={F. Falchi and C. Gennaro and P. Savino and P. Stanchev},
journal={IEEE MultiMedia},
title={Efficient Video-Stream Filtering},
year={2008},
volume={15},
number={1},
pages={52-62},
keywords={information filtering;multimedia computing;video streaming;information filtering;metric distance;video content representation;video-stream filtering;Councils;Information filtering;Information filters;Information science;Layout;MPEG 7 Standard;Nonlinear filters;Space technology;Streaming media;TV;MPEG-7;information filtering;metric space;pivot filtering;similarity search},
doi={10.1109/MMUL.2008.6},
ISSN={1070-986X},
month={Jan},}

An Innovative Approach for Indexing and Searching Digital Rights

W. Allasia, F. Chiariglione, F. Falchi, F. Gallo
In Third International Conference on Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS'07), Barcelona, Spain, 28-30 November 2007. IEEE Computer Society (Los Alamitos, CA), 2007: full paper, pp. 147-154. Scopus: 2-s2.0-47849111213, DOI: 10.1109/AXMEDIS.2007.17

@INPROCEEDINGS{4402871,
author={W. Allasia and F. Gallo and F. Chiariglione and F. Falchi},
booktitle={Automated Production of Cross Media Content for Multi-Channel Distribution, 2007. AXMEDIS '07. Third International Conference on},
title={An Innovative Approach for Indexing and Searching Digital Rights},
year={2007},
pages={147-154},
keywords={database indexing;meta data;query formulation;centralized systems;indexing;metadata management;searching digital rights;similarity searches;text searches;Conference management;Content management;Indexing;Information management;Innovation management;Intellectual property;Licenses;Production systems;Streaming media;Video sharing},
doi={10.1109/AXMEDIS.2007.17},
month={Nov},}

SAPIR: Scalable and Distributed Image Searching

F. Falchi, M. Kacimi, Y. Mass, F. Rabitti, P. Zezula
In the Second International Conference on Semantic and Digital Media Technologies (SAMT 2007), Genova, Italy, 5-7 December 2007, Poster and Demo Proceedings: demo paper, pp. 11-12. CEUR Workshop Proceedings, Volume 300, 2007. ISSN: 1613-0073, Scopus: 2-s2.0-79960115383

@inproceedings{DBLP:conf/samt/FalchiKMRZ07,
 author = {Fabrizio Falchi and
 Mouna Kacimi and
 Yosi Mass and
 Fausto Rabitti and
 Pavel Zezula},
 title = {{SAPIR:} Scalable and Distributed Image Searching},
 booktitle = {Poster and Demo Proceedings of the 2nd International Conference on
 Semantic and Digital Media Technologies, Genoa, Italy, December 5-7,
 2007},
 year = {2007},
 crossref = {DBLP:conf/samt/2007p},
 url = {http://ceur-ws.org/Vol-300/p06.pdf},
 timestamp = {Mon, 30 May 2016 16:57:35 +0200},
 biburl = {http://dblp.uni-trier.de/rec/bib/conf/samt/FalchiKMRZ07},
 bibsource = {dblp computer science bibliography, http://dblp.org}
}

Automatic metadata extraction and indexing for reusing e-learning multimedia objects

P. Bolettieri, F. Falchi, C. Gennaro, F. Rabitti
In MS '07: Workshop on multimedia information retrieval on The many faces of multimedia semantics (ACM Multimedia 2007) ACM (New York, NY, USA), 2007: pp. 21-28. ISBN: 978-1-59593-782-7, Scopus: 2-s2.0-37849006464, DOI: 10.1145/1290067.1290072

@inproceedings{Bolettieri:2007:AME:1290067.1290072,
 author = {Bolettieri, Paolo and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto},
 title = {Automatic Metadata Extraction and Indexing for Reusing e-Learning Multimedia Objects},
 booktitle = {Workshop on Multimedia Information Retrieval on The Many Faces of Multimedia Semantics},
 series = {MS '07},
 year = {2007},
 isbn = {978-1-59593-782-7},
 location = {Augsburg, Bavaria, Germany},
 pages = {21--28},
 numpages = {8},
 url = {http://doi.acm.org/10.1145/1290067.1290072},
 doi = {10.1145/1290067.1290072},
 acmid = {1290072},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {MPEG-7, automatic extraction, metadata, multimedia content management system, similarity search, user interface},
} 

A digital rights aware similarity measure for multimedia documents

W. Allasia, F. Falchi, F. Gallo, N. Orio
In MS '07: Workshop on multimedia information retrieval on The many faces of multimedia semantics (ACM Multimedia 2007) ACM (New York, NY, USA), 2007: pp. 73-80. ISBN: 978-1-59593-782-7, Scopus: 2-s2.0-37849048715, DOI: 10.1145/1290067.1290080

@inproceedings{Allasia:2007:DRA:1290067.1290080,
 author = {Allasia, Walter and Falchi, Fabrizio and Gallo, Francesco and Orio, Nicola},
 title = {A Digital Rights Aware Similarity Measure for Multimedia Documents},
 booktitle = {Workshop on Multimedia Information Retrieval on The Many Faces of Multimedia Semantics},
 series = {MS '07},
 year = {2007},
 isbn = {978-1-59593-782-7},
 location = {Augsburg, Bavaria, Germany},
 pages = {73--80},
 numpages = {8},
 url = {http://doi.acm.org/10.1145/1290067.1290080},
 doi = {10.1145/1290067.1290080},
 acmid = {1290080},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {digital rights, information retrieval, metric spaces, multimedia information systems},
} 

A distributed incremental nearest neighbor algorithm

F. Falchi, C. Gennaro, F. Rabitti, P. Zezula
In Proceedings of the 2nd International Conference on Scalable Information Systems (InfoScale '07), June 6-8, 2007, Suzhou, China. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), Brussels, Belgium, 2007: full paper, article No. 82. ISBN: 978-1-59593-757-5, Scopus: 2-s2.0-78349246995, DOI: 10.4108/infoscale.2007.196

@inproceedings{Falchi:2007:DIN:1366804.1366910,
 author = {Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto and Zezula, Pavel},
 title = {A Distributed Incremental Nearest Neighbor Algorithm},
 booktitle = {Proceedings of the 2nd International Conference on Scalable Information Systems},
 series = {InfoScale '07},
 year = {2007},
 isbn = {978-1-59593-757-5},
 location = {Suzhou, China},
 pages = {82:1--82:10},
 articleno = {82},
 numpages = {10},
 url = {http://dl.acm.org/citation.cfm?id=1366804.1366910},
 acmid = {1366910},
 publisher = {ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering)},
 address = {Brussels, Belgium},
 keywords = {distributed systems},
}

Florentine Coats of Arms on the Web: Experimenting retrieval based on text or image content

F. Falchi, F. Rabitti, W. Schweibenz, J. Simane
In Open Innovation. Neue Perspektiven im Kontext von Information und Wissen. Beiträge des 10. Internationalen Symposiums für Informationswissenschaft (ISI 2007) und der 13. Jahrestagung der IuK-Initiative Wissenschaft, Köln, 30. Mai - 1. Juni 2007. Hrsg. von Achim Oßwald. (Schriften zur Informationswissenschaft 46). Konstanz: UKV. 1-13. ISBN: 978-3-86764-020-6

@inproceedings{2007-ISI-Falchi,
 author = {Falchi, Fabrizio and Rabitti, Fausto and Schweibenz, Werner and Simane, Jan},
 title = {Florentine Coats of Arms on the Web: Experimenting retrieval based on text or image content},
 booktitle = {Open Innovation. Neue Perspektiven im Kontext von Information und Wissen. Beiträge des 10. Internationalen Symposiums für Informationswissenschaft (ISI 2007) und der 13. Jahrestagung der IuK-Initiative Wissenschaft},
 year = {2007},
 isbn = {978-3-86764-020-6},
 location = {Köln, Germany},
 }

A digital library framework for reusing e-learning video documents

P. Bolettieri, F. Falchi, C. Gennaro, F. Rabitti
In Creating New Learning Experiences on a Global Scale. Second European Conference on Technology Enhanced Learning, EC-TEL 2007, Crete, Greece, September 17-20, 2007. Proceedings, Lecture Notes in Computer Science, vol. 4753 Springer-Verlag Berlin Heidelberg (Germany), 2007: short paper, pp. 444-449. ISSN: 0302-9743 , ISBN: 978-3-540-75194-6, WOS: 000249725200035, Scopus: 2-s2.0-38349001450, DOI: 10.1007/978-3-540-75195-3_35

@Inbook{Bolettieri2007,
author="Bolettieri, Paolo and Falchi, Fabrizio and Gennaro, Claudio and Rabitti, Fausto",
editor="Duval, Erik and Klamma, Ralf and Wolpers, Martin",
title="A Digital Library Framework for Reusing e-Learning Video Documents",
bookTitle="Creating New Learning Experiences on a Global Scale: Second European Conference on Technology Enhanced Learning, EC-TEL 2007, Crete, Greece, September 17-20, 2007. Proceedings",
year="2007",
publisher="Springer Berlin Heidelberg",
address="Berlin, Heidelberg",
pages="444--449",
abstract="The objective of this paper is to demonstrate the reuse of digital content, as video documents or PowerPoint presentations, by exploiting existing technologies for automatic extraction of metadata (OCR, speech recognition, cut detection, MPEG-7 visual descriptors, etc.). The multimedia documents and the extracted metadata are then indexed and managed by the Multimedia Content Management System (MCMS) MILOS, specifically developed to support design and effective implementation of digital library applications. As a result, the indexed digital material can be retrieved by means of content based retrieval on the text extracted and on the MPEG-7 visual descriptors (via similarity search), assisting the user of the e-Learning Library (student or teacher) to retrieve the items not only on the basic bibliographic metadata (title, author, etc.).",
isbn="978-3-540-75195-3",
doi="10.1007/978-3-540-75195-3_35",
url="https://doi.org/10.1007/978-3-540-75195-3_35"
}

A Similarity Approach on Searching for Digital Rights

W. Allasia, F. Falchi, F. Gallo
Proceedings of I-MEDIA'07 and I-SEMANTICS ’07, International Conferences on New Media Technology and Semantic Systems, as part of TRIPLE-I 2007 (Graz, Austria, September 5-7, 2007) - 7th Workshop of the Multimedia Metadata Applications (M3A) - Journal of Universal Computer Science. ISSN: 0948-695X.

@inproceedings{Allasia_asimilarity,
 author = {Walter Allasia and Fabrizio Falchi and Francesco Gallo},
 title = {A Similarity Approach on Searching for Digital Rights},
 year = {2007},
 booktitle = {Proceedings of I-MEDIA'07 and I-SEMANTICS '07, International Conferences on New Media Technology and Semantic Systems, as part of TRIPLE-I 2007},
 issn = {0948-695X},
 publisher = {Know-Center},
}

Using MILOS to build a multimedia digital library application: The photobook experience

G. Amato, P. Bolettieri, F. Debole, F. Falchi, F. Rabitti, P. Savino
In Research and Advanced Technology for Digital Libraries, 10th European Conference on Digital Libraries, ECDL 2006, Alicante, Spain, September 17-22, 2006, Proceedings. Lecture Notes in Computer Science, vol. 4172, Springer-Verlag Berlin Heidelberg (Germany), 2006: full paper, pp. 379-390. ISBN: 3-540-44636-2, ISSN: 0302-9743,
WOS: 000241101500032, Scopus: 2-s2.0-33750230441, DOI: 10.1007/11863878_32

@Inbook{Amato2006,
author="Amato, Giuseppe and Bolettieri, Paolo and Debole, Franca and Falchi, Fabrizio and Rabitti, Fausto and Savino, Pasquale",
editor="Gonzalo, Julio and Thanos, Costantino and Verdejo, M. Felisa and Carrasco, Rafael C.",
title="Using MILOS to Build a Multimedia Digital Library Application: The PhotoBook Experience",
bookTitle="Research and Advanced Technology for Digital Libraries: 10th European Conference, ECDL 2006, Alicante, Spain, September 17-22, 2006. Proceedings",
year="2006",
publisher="Springer Berlin Heidelberg",
address="Berlin, Heidelberg",
pages="379--390",
isbn="978-3-540-44638-5",
doi="10.1007/11863878_32",
url="https://doi.org/10.1007/11863878_32"
}

On scalability of the similarity search in the world of peers

M. Batko, F. Falchi, D. Novak, P. Zezula
In Proceedings of the 1st international conference on Scalable information systems (InfoScale '06) Hong Kong (China), May 30 – June 1, 2006. ACM (New York, NY, USA), 2006: full paper, article No. 20 ISBN: 1-59593-428-6, Scopus: 2-s2.0-34547411435, DOI: 10.1145/1146847.1146867

@inproceedings{Batko:2006:SSS:1146847.1146867,
 author = {Batko, Michal and Novak, David and Falchi, Fabrizio and Zezula, Pavel},
 title = {On Scalability of the Similarity Search in the World of Peers},
 booktitle = {Proceedings of the 1st International Conference on Scalable Information Systems},
 series = {InfoScale '06},
 year = {2006},
 isbn = {1-59593-428-6},
 location = {Hong Kong},
 articleno = {20},
 url = {http://doi.acm.org/10.1145/1146847.1146867},
 doi = {10.1145/1146847.1146867},
 acmid = {1146867},
 publisher = {ACM},
 address = {New York, NY, USA},
}

Using MILOS to build an on-line photo album: the PhotoBook

G. Amato, P. Bolettieri, F. Debole, F. Falchi, F. Rabitti, P. Savino
In SEBD 2006: Fourteenth Italian Symposium on Database Systems – (Portonovo, Italy, June 18-21, 2006): full paper, pp. 233-240. ISBN: 88-6068-018-2. Scopus: 2-s2.0-84893266139,

@INPROCEEDINGS{2006-SEBD-Falchi,
 author = {G. Amato and P. Bolettieri and F. Debole and F. Falchi and F. Rabitti and P. Savino},
 title = {Using MILOS to build an on-line photo album: the PhotoBook},
 booktitle = {Proceedings of SEBD '06, the 14th Italian Symposium on Advanced Database Systems},
 year = {2006},
 pages = {233-240}
}

Selection of MPEG-7 image features for improving image similarity search on specific data sets

P.L. Stanchev, G. Amato, F. Falchi, C. Gennaro, F. Rabitti, and P. Savino
Proceedings of the Seventh IASTED International Conference on Computer Graphics and Imaging (CGIM 2004), August 17-19, 2004, Kauai, Hawaii, USA, pp. 395-400 International Association of Science and Technology for Development – IASTED Acta Press ISBN: 0-88986-418-7, ISSN: 1482-7905, WOS: 000228521000067, Scopus: 2-s2.0-10444268921,

@INPROCEEDINGS{Stanchev04selectionof,
 author = {Peter L. Stanchev and Giuseppe Amato and Fabrizio Falchi and Claudio Gennaro and Fausto Rabitti and Pasquale Savino},
 title = {Selection of {MPEG-7} image features for improving image similarity search on specific data sets},
 booktitle = {Proceedings of the Seventh IASTED International Conference on Computer Graphics and Imaging (CGIM 2004)},
 year = {2004},
 pages = {395--400}
}

Improving image similarity search effectiveness in a multimedia content management system

G. Amato, F. Falchi, C. Gennaro, F. Rabitti, P. Savino, P. Stanchev
In MIS 2004: proceedings of the 10th Workshop on Multimedia Information Systems, August 2004: pp. 139-146.

@INPROCEEDINGS{Falchi_improvingimage,
 author = {Giuseppe Amato and Fabrizio Falchi and Claudio Gennaro and Fausto Rabitti and Pasquale Savino and Peter L. Stanchev},
 title = {Improving image similarity search effectiveness in a multimedia content management system},
 booktitle = {Proceedings of the 10th Workshop on Multimedia Information Systems (MIS 2004)},
 year = {2004},
 pages = {139--146}
}


PhD Thesis

A Content-Addressable Network for Similarity Search in Metric Spaces

F. Falchi

Joint PhD Università degli Studi di Pisa, Ingegneria dell'Informazione and Masaryk University, Faculty of Informatics, Brno.

Because of the ongoing digital data explosion, more advanced search paradigms than the traditional exact match are needed for content-based retrieval in the huge and ever-growing collections of data produced in application areas such as multimedia, molecular biology, marketing, computer-aided design and purchasing assistance. As the variety of data types quickly moves toward databases used directly by people, computer systems must be able to model fundamental human reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recognition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity, and many similarity search indexes have been developed. In this thesis, we accept the metric space similarity paradigm and concentrate on the scalability issues. By exploiting computer networks and applying Peer-to-Peer communication paradigms, we build a structured network of computers able to process similarity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer-to-Peer system for similarity search in metric spaces called Metric Content-Addressable Network (MCAN), an extension of the well-known Content-Addressable Network (CAN) used for hash lookup. A prototype implementation of MCAN was tested on real-life datasets of image features, protein symbols, and text; the observed results are reported. We also compared the performance of MCAN with three other, recently proposed, distributed data structures for similarity search in metric spaces.
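A minimal sketch of the pivot-based mapping underlying such a structure (peer zones, routing, and zone splitting are omitted; the metric and the pivots below are illustrative):

import numpy as np

def dist(a, b):
    # Illustrative metric: Euclidean distance between feature vectors.
    return float(np.linalg.norm(a - b))

def can_coordinates(obj, pivots):
    # Project a metric-space object into an N-dimensional CAN space:
    # its i-th coordinate is the distance to the i-th pivot.
    return np.array([dist(obj, p) for p in pivots])

pivots = [np.zeros(8), np.ones(8)]              # N = 2 pivots -> 2-D CAN space
point = can_coordinates(np.full(8, 0.5), pivots)

By the triangle inequality, |d(x, p_i) - d(q, p_i)| <= d(x, q) for every pivot p_i, so the mapping is contractive: a range query can be answered by contacting only the peers whose zones intersect the query ball in the coordinate space, without missing any true result.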

@phdthesis{FalchiPhDThesis,
 author = "Fabrizio Falchi",
 title = "A Content-Addressable Network for Similarity Search in Metric Spaces",
 school "University of Pisa, Ingegneria dell'Informazione and Masaryk University, Faculty of Informatics, Brno",
 supervisor = "Lopriore, Lanfranco and Zezula, Pavel and Rabitti, Fausto",
 year = 2007,
 month = 5,
}


Project Deliverables

Strumenti per la classificazione ed annotazione automatica delle immagini (Tools for automatic image classification and annotation)
Editor: F. Falchi, Authors: G. Amato, F. Falchi, P. Bolettieri
VISITO Tuscany, A4.2, 31 May 2011

Sviluppo componente per la ricerca efficiente di immagini (Development of a component for efficient image search)
Editor: F. Falchi, Authors: G. Amato, P. Bolettieri, F. Falchi, C. Gennaro
VISITO Tuscany, A4.3, 28 February 2011

Sviluppo componente per il matching approssimato di immagini (Development of a component for approximate image matching)
Editor: F. Falchi, Authors: G. Amato, F. Falchi, P. Bolettieri
VISITO Tuscany, A4.1, 30 September 2010

Lo stato dell'arte: tecnologia ed utenti (The state of the art: technology and users)
Editor: F. Falchi, Authors: F. Falchi, V. Ippolito, D. Loschiavo, C. Lucchese, F. Lungarotti, A. Melani, S. Minelli, S. Pialli, S. Rossi, S. Salvadori, R. Scartoni, R. Scopigno, F. Tavanti, F. la Torre, R. Venturini
VISITO Tuscany, A1.1.1, 23 February 2010, A1.1.2, 30 September 2010, A1.1.3, 31 May 2011

The ASSETS API
C. Meghini, F. Alberto Cardillo, A. Esuli, F. Falchi, D. Ceccarelli, P. Bolettieri, N. Aloia, C. Concordia, V. Valdés, F. López, J.M. Martínez, J. Bescós, P. Castells, M.A. García, O. Paytuvi, M. Lazaridis, A. Beloued, N. Spyratos, T. Sugibuchi
ASSETS (Advanced Search Services and Enhanced Technological Solutions for the European Digital Library), D.2.0.4

Interface Specifications and System Design
L. Briguglio, S. Gordea, A. Lindley, E. Tzoannos, C. Meghini, F.A. Cardillo, A. Esuli, F. Falchi, D. Ceccarelli, P. Bolettieri, N. Aloia, C. Concordia, V. Valdes, O. Paytuvi, M. Lazaridis, A. Beloued, N. Spyratos, T. Sugibuchi
ASSETS (Advanced Search Services and Enhanced Technological Solutions for the European Digital Library), D.2.0.2

Executing complex similarity queries over multi layer P2P search structures
Editor: F. Falchi Authors: M. Batko, F. Falchi
SAPIR (Search In Audio Visual Content Using Peer-to-Peer IR), D.5.4

Design of the P2P Similarity Based Indexing Technique
Editor: Fabrizio Falchi, Raffaele Perego Authors: M. Batko, F. Falchi, R. Perego, P. Zezula
SAPIR (Search In Audio Visual Content Using Peer-to-Peer IR), D.4.1

Common Schema for Feature Extraction
Editor: A. Kaplan, Authors: A. Kaplan, W. Allasia, F. Falchi, F. Gallo, C. Hagège, J. Mamou, Y. Mass, R. Miotto, N. Orio
SAPIR (Search In Audio Visual Content Using Peer-to-Peer IR), D.3.1

Feature Extraction Modules for Audio, Video, Music, and Text
Editor: A. Kaplan, Authors: P. Bolettieri, F. Falchi, C. Lucchese, W. Allasia, F. Gallo, J. Mamou, B. Sznajder, R. Miotto, N. Orio, C. Brun, J.M. Coursimault, C. Hagège, A. Kaplan
SAPIR (Search In Audio Visual Content Using Peer-to-Peer IR), D.3.2 D3.3 D3.4 D3.5

Design of Techniques for Caching and Replicas Management on P2P
Editor: C. Lucchese, Authors: C. Lucchese, R. Perego, M. Kacimi, S. Orlando, F. Falchi
SAPIR (Search In Audio Visual Content Using Peer-to-Peer IR), D.4.3

State of the art of current P2P and ontology languages initiatives
Editor: A. Maurino, Authors: D. Beneventano, C. Aiello, … , F. Falchi, et al.
NeP4B (Networked Peers For Business), D2.1.1

Prototypes for building the semantic peer - First release
Editor: M. Vincini, Authors: L. Po, F. Guerra, T. Fagni, F. Falchi, M. Rosini, D. Cerizza, F. Corcoglioniti, M. Mordacchini
NeP4B (Networked Peers For Business), D3.2.1

Prototypes for building the semantic peer - Final release
Editor: M. Vincini, Authors: L. Po, F. Guerra, T. Fagni, F. Falchi, M. Rosini, D. Cerizza, F. Corcoglioniti, M. Mordacchini
NeP4B (Networked Peers For Business), D3.2.1



Technical Reports




Complete List

Education

Master's degree in Computer Engineering
from the University of Pisa, Italy.

Ph.D. in Information Engineering
from Information Engineering Department of the University of Pisa, Italy.

Ph.D. in Informatics
from the Faculty of Informatics of the Masaryk University of Brno, Czech Republic.

MBA in “Innovation Management & Services Engineering”
from Scuola Superiore Sant'Anna, Pisa, Italy.

Piano Degree
under the guidance of Prof.ssa Alma Cheli Quartaroli - Siena, 1995.

Music Composition “Compimento Inferiore”
under the guidance of Prof. Andrea Nicoli at the Conservatory “G. Puccini” - La Spezia.

Attended the “Third DELOS International Summer School on Digital Library Technologies” (ISDL 2004) and the “Scuola Nazionale dei Dottorati di Ricerca in Ingegneria Informatica” (National School of Information Engineering PhD Students).

Awards and Grants

Competitions:

Best Papers:

Grants:

Projects

Ongoing:

FAIR - Future Artificial Intelligence Research is the Italian project approved in the NRRP call ‘Extended Partnerships: Artificial Intelligence. Fundamental aspects’. Strongly desired by the Italian Artificial Intelligence community, it addresses one of the hottest topics of current Information Technology, i.e. the theoretical, modelling and engineering aspects of modern Artificial Intelligence, and studies and develops the methodological foundations, architectural models, theory and practice of AI.
SERICS - SEcurity and RIghts In the CyberSpace, financed following participation in the public notice for the presentation of proposals for the creation of "partnerships extended to universities, research centres, and companies for the funding of basic research projects", as part of the National Recovery and Resilience Plan.
THE - Tuscany Health Ecosystem, the only life sciences innovation ecosystem funded under the PNRR.
AI4Media - A Centre of Excellence delivering next generation AI Research and Training at the service of Media, Society and Democracy.
Tuscany X.0 - Tuscany EU Digital Innovation Hub.

Ended:

AI4EU is the first European Artificial Intelligence On-Demand Platform and Ecosystem, developed with the support of the European Commission under the H2020 programme.
NAUSICAA - “NAUtical Safety by means of Integrated Computer-Assistance Appliances 4.0”.
The NAUSICAA 4.0 project aims at creating a system for medium and large boats in which the conventional control, propulsion, and thrust systems are integrated with a series of latest-generation sensors such as lidar systems, cameras, radar, marine drones, and aircraft, in order to provide a complementary assistance system during the navigation and mooring phases.
WeAreClouds@Lucca develops an information system based on a network of pre-existing cameras, capable of monitoring the main public places and access points to evaluate the presence of people in real time and statistically over time and, without affecting individual privacy, also providing information on the age and gender of individuals.
AI4ChSites - Intelligenza Artificiale per il Monitoraggio Visuale dei Siti Culturali (Artificial Intelligence for Visual Monitoring of Cultural Sites).
ViDEMo (Visual Deep Engines for Monitoring, scientific director Fabrizio Falchi) takes inspiration from recent advances in machine learning technologies, and in particular from representation learning methods based on multiple, hierarchical representations (deep learning). These advances enable new services, unthinkable until a few years ago, in the context of visual analysis. The project aims to study and further advance the state of the art of image content analysis techniques for the automatic extraction of information that enables similarity search, visual navigation, and the recognition of objects and faces.
SmartNews (Social Sensing for Breaking News) aims at developing a tool able to support journalists in the whole process of detecting breaking news, collecting relevant information about it, and writing articles. The tool will "listen" to social media and automatically locate breaking news.
In the smart cities context, in the second half of 2013 CNR launched a project entitled Renewable Energy and ICT for Energy Sustainability (Energia da Fonti Rinnovabili e ICT per la Sostenibilità Energetica). The project was based on the widespread use of renewable energy sources (with the related storage technologies and energy management) and the extensive use of ICT technologies for enhanced management of energy flows, making energy services more efficient by adapting them to demand (and therefore encouraging energy saving and rational energy use), with the informed involvement of citizens. A group of researchers in the CNR area of Pisa was involved in part of this wide project. Smart energy diffusion and management, however, is only one aspect, among many others, of a smart city, and the CNR area in Pisa (the largest CNR area in Italy) is in fact a small city where smart technologies and applications can be tested before being transferred to a smart city.
Presto4U was a two-year project supported by a core network of 14 PrestoCentre members. The project aimed to identify useful results of research into digital audiovisual preservation and to raise awareness and improve their adoption by technology and service providers as well as media owners. Fabrizio Falchi was the leader of the Research and Scientific Collections Community of Practice.
VISITO Tuscany (VIsual Support to Interactive TOurism in Tuscany) investigated and realized technologies able to offer an interactive, customized, advanced tour guide service for visiting the cities of art in Tuscany.
EAGLE, the Europeana network of Ancient Greek and Latin Epigraphy, is a best-practice network co-funded by the European Commission under its Information and Communication Technologies Policy Support Programme. EAGLE provides a single user-friendly portal to the inscriptions of the Ancient World, a massive resource for both the curious and the scholarly.
The European project SAPIR (Search on Audio-visual content using Peer-to-peer Information Retrieval) developed a large-scale, distributed Peer-to-Peer infrastructure that makes it possible to search for audio-visual content by querying the specific characteristics (i.e., features) of the content. SAPIR's goal was to establish a giant Peer-to-Peer network, where users are peers that produce audiovisual content using multiple devices (e.g., cell phones) and service providers use more powerful peers that maintain indexes and provide search capabilities.
ASSETS (Advanced Search Services and Enhanced Technological Solutions for the European Digital Library) was a two-year project co-funded by the CIP Policy Support Programme, which aimed to improve the usability of Europeana by developing, implementing and deploying software services focused on search, browsing and interfaces. ASSETS also strove to make more digital items available on Europeana by involving content providers across different cultural environments.
MOTUS (MObility and Tourism in Urban Scenarios) is a platform of services able to gather, aggregate and interpret, in real time, urban mobility data from different infrastructures scattered across urban areas and historic cities. The main objective of the project is to improve the management, sustainability and environmental compatibility of urban mobility. Co-funded by the Ministry of Economic Development within the Industria 2015 Programme, MOTUS addresses the needs of citizens and tourists in cities of artistic interest, tourist sites and other urban scenarios.
Networked Peers for Business: a scalable and flexible framework providing advanced enterprise interoperation in a common business environment, based on a peer-to-peer (P2P) data-driven SWS network for B2B applications. Firms are free to join and leave the network at any time, to act both as providers of their own services and as consumers, and to classify their own profiles, offers, services and other features to gain public visibility with potential customers and partners.
Teaching & Talks

University Courses in Masters (Lauree Magistrali)

University Courses in 2nd Level Masters

Summer/Winter Schools Courses

PhD Tutor

Master Thesis Tutor

Tutorials

Invited Talks

Panels

Talks

Fabrizio presented scientific results at the following events:

Activity Reports

Activity reports: 2011-2014, 2012, 2013, 2014, 2015, 2016, 2017, 2018.
Tools

Datasets

The main goal of this Twitter dataset is to support research on deepfake social-media text detection in a real-world setting. Besides evaluating the overall accuracy of a deepfake text detector, more specific evaluations can be carried out: each sample is labelled with the corresponding text generation technique ('human', 'GPT-2', 'RNN', 'Others'), so you can understand how a detector behaves with respect to each generative method, as in the sketch below.
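A minimal sketch of that per-technique evaluation, assuming a hypothetical CSV layout (columns text and technique) and a stand-in detector function; neither is the dataset's actual schema or API:

import pandas as pd

# Hypothetical file and column names; the real distribution may differ.
df = pd.read_csv("deepfake_tweets.csv")  # columns: text, technique

def my_detector(text: str) -> int:
    # Stand-in for any deepfake-text classifier:
    # returns 1 for machine-generated text, 0 for human text.
    return 0

df["pred"] = df["text"].map(my_detector)
df["true"] = (df["technique"] != "human").astype(int)
df["correct"] = df["pred"] == df["true"]

# Overall accuracy, then a breakdown showing which generative
# method ('GPT-2', 'RNN', 'Others') fools the detector most often.
print("overall accuracy:", df["correct"].mean())
print(df.groupby("technique")["correct"].mean())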
The Virtual World Personal Protection Equipment dataset (VW-PPE) is a dataset for training and testing techniques for personal protection equipment recognition. It includes images from both virtual worlds and the real world.
ViPeD is a synthetically generated set of pedestrian scenarios extracted from a realistic 3D video game, where labels are automatically generated by exploiting 2D pedestrian positions. It extends the JTA (Joint Track Auto) dataset, adding real-world camera lens effects and precise bounding-box annotations useful for pedestrian detection.
YFCC100M-HNfc6 is a deep-features dataset extracted from the Yahoo Flickr Creative Commons 100M (YFCC100M) dataset, created in 2014 as part of the Yahoo Webscope program. The dataset consists of approximately 99.2 million photos and 0.8 million videos, all uploaded to Flickr between 2004 and 2014 and published under a Creative Commons commercial or non-commercial license.
T4SA is a set of about 3 million tweets (texts and associated images) labeled according to the sentiment polarity of the text (positive, neutral or negative), as predicted by a tandem LSTM-SVM architecture. After removing near-duplicate images, we selected a balanced subset of images, named B-T4SA, which we used to train our visual sentiment classifiers (see the sketch after the reference below).
@INPROCEEDINGS{8265255,
author={L. Vadicamo and F. Carrara and A. Cimino and S. Cresci and F. Dell'Orletta and F. Falchi and M. Tesconi},
booktitle={2017 IEEE International Conference on Computer Vision Workshops (ICCVW)},
title={Cross-Media Learning for Image Sentiment Analysis in the Wild},
year={2017},
pages={308-317},
keywords={convolution;data visualisation;feedforward neural nets;image classification;learning (artificial intelligence);sentiment analysis;social networking (online);Tweets;cross-media learning;deep convolutional neural network training;image content;image sentiment analysis;multimedia content;social media;textual data;visual sentiment analysis;visual sentiment classifier;Feature extraction;Media;Sentiment analysis;Support vector machines;Twitter;Visualization},
doi={10.1109/ICCVW.2017.45},
month={Oct},
}
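A minimal sketch of the labeling-and-balancing recipe described above, assuming a hypothetical tabular layout (columns image_path, text_polarity, phash) rather than the dataset's actual files; near-duplicate removal is approximated here by identical perceptual hashes:

import pandas as pd

# Hypothetical columns; the key idea is weak supervision:
# the polarity predicted from the text becomes the image's label.
df = pd.read_csv("t4sa.csv")  # columns: image_path, text_polarity, phash

# Remove near-duplicate images (approximated by equal hashes).
df = df.drop_duplicates(subset="phash")

# Balanced subset in the spirit of B-T4SA: sample an equal
# number of images from each polarity class.
n = df["text_polarity"].value_counts().min()
b_t4sa = df.groupby("text_polarity").sample(n=n, random_state=0)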
CNRPark is a benchmark of about 12,000 images of 250 parking spaces collected on different days from 2 distinct cameras, which were placed so as to have different perspectives and angles of view, various light conditions and several occlusion patterns. We built a mask for each parking space in order to segment the original screenshots into several patches, one per parking space. Each of these patches is a square whose side is proportional to the distance from the camera: the nearest spaces yield bigger patches than the farthest ones. We then labelled all the patches according to the occupancy status of the corresponding parking space (a sketch of this segmentation step follows the reference below).
@INPROCEEDINGS{7543901,
author={G. Amato and F. Carrara and F. Falchi and C. Gennaro and C. Vairo},
booktitle={2016 IEEE Symposium on Computers and Communication (ISCC)},
title={Car parking occupancy detection using smart camera networks and Deep Learning},
year={2016},
pages={1212-1217},
doi={10.1109/ISCC.2016.7543901},
isbn = {978-1-5090-0679-3},
publisher = {{IEEE} Computer Society}
}
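A minimal sketch of the patch-segmentation step, using PIL and a hypothetical CSV of per-space square crops; the released dataset already ships the ready-made patches, so this only illustrates the procedure:

from PIL import Image
import csv

shot = Image.open("camera1_frame.jpg")  # hypothetical screenshot name

with open("spaces_camera1.csv") as f:   # hypothetical: space_id, x, y, side
    for row in csv.DictReader(f):
        x, y, side = int(row["x"]), int(row["y"]), int(row["side"])
        # Square crop whose side is proportional to the distance
        # from the camera (nearer spaces give larger patches).
        patch = shot.crop((x, y, x + side, y + side))
        patch.resize((150, 150)).save(f"patches/{row['space_id']}.jpg")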
CNRPark-EXT is a dataset of roughly 150,000 labeled images of vacant and occupied parking spaces, built on a parking lot with 164 parking spaces. CNRPark-EXT includes and significantly extends CNRPark.
@article{AMATO2017327,
title = "Deep learning for decentralized parking lot occupancy detection",
journal = "Expert Systems with Applications",
volume = "72",
pages = "327 - 334",
year = "2017",
issn = "0957-4174",
doi = "10.1016/j.eswa.2016.10.055",
url = "http://www.sciencedirect.com/science/article/pii/S095741741630598X",
author = "Giuseppe Amato and Fabio Carrara and Fabrizio Falchi and Claudio Gennaro and Carlo Meghini and Claudio Vairo",
keywords = "Machine learning, Classification, Deep learning, Convolutional neural networks, Parking space dataset"
}
CoPhIR is a collection of 100 million images, with the corresponding descriptive features, to be used for experimenting with new scalable similarity-search techniques and comparing their results. In the context of the SAPIR (Search on Audio-visual content using Peer-to-peer Information Retrieval) European project, we had to test our distributed similarity-search technology on a realistic data set. Since no large-scale collection was available for research purposes, we had to tackle the non-trivial process of image crawling and descriptive feature extraction (we used five MPEG-7 features) using the European EGEE computing grid. A brute-force baseline for this kind of search is sketched after the reference below.
@article{DBLP:journals/corr/abs-0905-4627,
author = {Paolo Bolettieri and Andrea Esuli and Fabrizio Falchi and Claudio Lucchese and Raffaele Perego and Tommaso Piccioli and Fausto Rabitti},
title = {CoPhIR: a Test Collection for Content-Based Image Retrieval},
journal = {CoRR},
volume = {abs/0905.4627},
year = {2009},
url = {http://arxiv.org/abs/0905.4627}
}
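For context, here is a brute-force k-nearest-neighbour baseline over a feature matrix, i.e., the linear scan that scalable similarity-search techniques evaluated on collections like CoPhIR aim to beat; the file name and shapes are illustrative, not part of the CoPhIR distribution:

import numpy as np

feats = np.load("cophir_features.npy")  # hypothetical (n_images, d) matrix

def knn(query: np.ndarray, k: int = 10):
    # Euclidean distance from the query to every indexed vector:
    # O(n*d) per query, which is exactly the cost that distributed
    # and approximate indexes are designed to avoid at 100M scale.
    dists = np.linalg.norm(feats - query, axis=1)
    order = np.argsort(dists)[:k]
    return order, dists[order]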
Pisa Dataset

Code

VIR
VIR (Visual Information Retrieval) is a library for content-based image retrieval and classification based on global and local features. The library allows comparing images by their global and/or local features. It includes local-feature matching, RANSAC geometric verification, MPEG-7 global-feature comparisons, kNN classification, and a Bag-of-Words (or Bag-of-Features) approach. It is an ongoing project.
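The following OpenCV sketch (not VIR's own API, which is not shown here) illustrates the kind of local-feature matching with RANSAC geometric verification that the library implements; file names are placeholders:

import cv2
import numpy as np

img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("candidate.jpg", cv2.IMREAD_GRAYSCALE)

# Detect local features and compute binary descriptors.
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force Hamming-distance matching with cross-check.
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

# Geometric verification: RANSAC keeps only the matches consistent
# with a single homography between the two images (needs >= 4 matches).
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
print("RANSAC inliers:", int(inlier_mask.sum()), "of", len(matches))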
Services

General Co-chair of Ital-IA 2023, the Italian National Conference on Artificial Intelligence

Technical Program Committee Chair of SISAP 2022, 15th International Conference on Similarity Search and Applications

Grand Challenge Chair of ICMR 2022, ACM International Conference on Multimedia Retrieval

Local Chair of SEBD 2022, 30th Italian Symposium on Advanced Database Systems

Panel Chair of MMM 2021, 27th International Conference on Multimedia Modeling (MMM 2021), June 22-24, 2021, virtual

PhD Session Chair of SEBD 2019, 27th Italian Symposium on Advanced Database Systems

Sponsorship Co-chair of SIGIR 2016, 39th International ACM SIGIR Conference on Research and Development in Information Retrieval

Publications chair of the 8th International Conference on Similarity Search and Applications (SISAP 2015)

Chair of the track Engineering Large-Scale Distributed Systems (ELSDS) at SAC 2008, the 23rd Annual ACM Symposium on Applied Computing (Vila Galé in Fortaleza, Ceará, Brazil - March 16 - 20, 2008).

Chair of the CHORUS First Workshop on Peer-to-Peer Architectures for Multimedia Retrieval (1P2P4mm), co-located with INFOSCALE 2008.

Publicity Chair of INFOSCALE 2008, the Third International Conference on Scalable Information Systems.

Program Committee Member of:

Has served as reviewer for the following journals:

Contact Me

Pisa, Italy

+39 050 315 2911


Thanks to my family, my parents, my grandmother Gemma and all my ancestors.
