Methodology and Experiment


Tokenization/Parsing

Stopword removal, lemmatization, stemming

Search System and techniques

BM25, demographic filtering, relevance boosting with medical NER

Re-ranking System

MonoBERT, DuoBERT

Adhoc Query Generation with T5 model

SIGIR query pairs for training, result boosting

Tokenization/Parsing

Depending on the structure of the documents, we apply preprocessing steps that include stopword removal, lemmatization, and stemming.
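
The three preprocessing steps can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline used in the experiments: the stopword list, the `LEMMAS` lookup table, and the suffix-stripping `stem` function are all assumptions chosen to keep the example self-contained. A real system would more likely rely on a full stopword list, a dictionary-based lemmatizer, and an established stemmer such as Porter's.

```python
import re

# Illustrative stopword list (assumption); a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "with", "for", "to"}

# Toy lemma lookup (assumption) standing in for a dictionary-based lemmatizer.
LEMMAS = {"was": "be", "were": "be", "has": "have", "had": "have",
          "women": "woman", "men": "man"}

# Suffixes stripped by the crude stemmer below (not the Porter algorithm).
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stopwords, then lemmatize by lookup or fall back to stemming."""
    out = []
    for token in tokenize(text):
        if token in STOPWORDS:
            continue
        out.append(LEMMAS.get(token, stem(token)))
    return out


print(preprocess("The patient was treated with antibiotics"))
```

Applying lemmatization before stemming keeps irregular forms such as "was" from being mangled by suffix stripping, which is one reason the two steps are often combined in this order.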



Search System
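
The overview names BM25 as the base ranking function. As a rough illustration of what that scoring involves, the sketch below implements the standard Okapi BM25 formula over pre-tokenized documents. The function name `bm25_scores` and the parameter values `k1=1.5` and `b=0.75` are assumptions (commonly used defaults), not settings reported for this system, and the demographic filtering and medical NER boosting steps are not covered here.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query.

    k1 and b are assumed defaults, not values taken from the experiments.
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy documents for illustration only.
docs = [["diabetes", "insulin", "therapy"], ["asthma", "inhaler"], ["diabetes", "diet"]]
scores = bm25_scores(["insulin", "diabetes"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In a pipeline like the one outlined in the overview, a ranking of this kind would supply the candidate list that the re-ranking stage then reorders.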



Re-ranking System



Adhoc Query Generation with T5 model

A Text-to-Text Transfer Transformer (T5) base model is fine-tuned for query generation. We trained the model on SIGIR (description, ad-hoc) query pairs.

We found that the generated queries often contain excerpts from the description. In addition, the synthetic queries improved results on TREC but performed worse on SIGIR.
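
To make the training setup more concrete, the sketch below shows one way (description, ad-hoc) pairs could be turned into text-to-text examples, together with a simple check for how much of a generated query is copied from its description. The `generate query: ` prefix, the `source`/`target` field names, and both function names are hypothetical choices for illustration. The actual input format used for fine-tuning is not specified here, and the example pairs are made up.

```python
def build_examples(pairs, prefix="generate query: "):
    """Turn (description, ad-hoc query) pairs into source/target text records.

    The prefix and field names are assumptions, not the format used in training.
    """
    examples = []
    for description, adhoc_query in pairs:
        # Collapse runs of whitespace and trim both sides.
        description = " ".join(description.split())
        adhoc_query = " ".join(adhoc_query.split())
        if not description or not adhoc_query:
            continue  # skip incomplete pairs
        examples.append({"source": prefix + description, "target": adhoc_query})
    return examples


def copied_fraction(description, generated_query):
    """Fraction of generated-query tokens that also appear in the description."""
    desc_tokens = set(description.lower().split())
    query_tokens = generated_query.lower().split()
    if not query_tokens:
        return 0.0
    return sum(t in desc_tokens for t in query_tokens) / len(query_tokens)


# Made-up pair for illustration only.
pairs = [("A 65 year old   man with chest pain", "chest pain elderly male")]
print(build_examples(pairs))
print(copied_fraction("a 65 year old man with chest pain", "chest pain man"))
```

A high `copied_fraction` over many generated queries would be one way to quantify the observation that the output tends to reuse excerpts from the description.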