Efficient Algorithms for Medical Image Segmentation
Supervised by Arlindo L. Oliveira and authored by José Martinho
With the growth in cancer cases and the increasing
expenditures in the healthcare system, it is necessary to
automate processes, aiming for faster diagnosis and lower
expenses. Although current technologies make it possible to
capture high-resolution 3D images of organs, manual
segmentation of organs and tumours is still a complex process
that requires high expertise. State-of-the-art algorithms are
already very accurate. However, they are very
compute-intensive, requiring expensive hardware and
wasting energy. Coupling state-of-the-art
efficient feature extraction algorithms to the nnUNet
segmentation framework, this work proposes novel efficient
architectures for medical image segmentation. For some tasks,
similar results were achieved using around 30% fewer Floating
Point Operations (FLOPs) than the baseline nnUNet, also
decreasing the inference time. Moreover, better performance
than nnUNet was achieved using architectures with slightly
longer inference time.
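As an illustration of how a FLOP comparison like the one above can be made, the following minimal sketch counts the operations of a single 3D convolution and compares a standard layer with a depthwise-separable replacement. The depthwise-separable block, the layer sizes, and the function names are assumptions chosen for the example; the abstract does not say which efficient feature extractors were used.

```python
def conv3d_flops(c_in, c_out, kernel, out_voxels, groups=1):
    """Approximate FLOPs of one 3D convolution (bias ignored).

    Each output voxel of each output channel needs (c_in / groups) * kernel^3
    multiply-accumulates, and one multiply-accumulate is counted as 2 FLOPs.
    """
    macs = (c_in // groups) * kernel ** 3 * c_out * out_voxels
    return 2 * macs


def flops_reduction(baseline, candidate):
    """Fraction of the baseline FLOPs saved by the candidate."""
    return 1.0 - candidate / baseline


# Hypothetical layer: 32 -> 64 channels, 3x3x3 kernel, 1000 output voxels.
voxels = 1000
standard = conv3d_flops(32, 64, 3, voxels)

# Assumed efficient replacement: depthwise 3x3x3 followed by pointwise 1x1x1.
separable = (conv3d_flops(32, 32, 3, voxels, groups=32)
             + conv3d_flops(32, 64, 1, voxels))

print(f"standard:  {standard} FLOPs")
print(f"separable: {separable} FLOPs")
print(f"reduction: {flops_reduction(standard, separable):.1%}")
```

A whole-network figure such as the 30% quoted above would be obtained by summing such per-layer counts over the full architecture.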
Towards Improving Ischemic Stroke Functional Outcome
Prediction with Computed Tomography Brain Scans Using Deep
Learning
Supervised by Arlindo L. Oliveira and Catarina Fonseca and
authored by Gonçalo Oliveira
Among non-transmissible diseases, stroke is the second
leading cause of death and disability in the world. Quick
diagnosis and prognosis are of paramount importance given the
rapid degradation of the affected brain and short time frame
available for the recommended treatments. The collection of
computed tomography brain scans is part of the standard
patient care. However, they are examined manually by
experts. Moreover, although these scans contain features that
strongly predict patient functional outcome, they are rarely
considered by the clinical models currently in use, which
mostly rely on demographic and clinical patient variables.
This work explores
three different approaches to improve on these models,
obtaining results comparable to the state of the art in their
respective categories. In the tabular approach, machine
learning classifiers use the same type of variables used by
the clinical models to predict the functional outcome. In the
imaging approach, the outcome is directly predicted solely
from the patient's brain scans, using deep artificial neural
networks. Here several architectures never before tried in
this task are explored, including multiple instance learning
models and Siamese networks that leverage a useful brain
hemisphere symmetry bias. Finally, in the hybrid approach,
both important clinical features and imaging information are
leveraged and combined in a simpler and more interpretable
manner than that of existing models.
Transaction-Based Entity Monitoring in a Client Due Diligence
Context
Supervised by Arlindo L. Oliveira and Jacopo Bono and
authored by Oleksandr Stopchak
Current solutions to Anti-Money Laundering encompass three
different components, namely, transaction monitoring,
screening and Customer Due Diligence (CDD). These have been mainly
based on rule systems and human analysts, which can lead to
many false positive alerts and a large load on human
resources. In this work, we explore a novel approach to aid
CDD. To do this, we propose the usage of machine learning
methods to calculate an entity’s risk based on its
transactional behavior by leveraging historical transactions
to generate a Risk Score (RS). First, we summarize the
transaction behavior into an embedding using feature
engineering. Then, we
calculate a risk score that quantifies the dissimilarity of an
entity’s behavior to what is expected using Anomaly Detection
techniques. Finally, with the use of explainability techniques
we clarify the assigned risk score by showing the specifics of
an entity’s behavior that contributed to the final assessment
of our approach. With our proposed method, we can reduce the
burden on human analysts by 1) using machine-learning-based
techniques that can identify incorrectly classified, and
therefore potentially illicit, entities by comparing their
transactional behavior to other entities with the same label;
and 2) generating a report with information that can provide a
reasonable explanation for an assigned RS in the form of
visualizations.
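The pipeline described above (behavioural embedding, dissimilarity-based risk score, per-feature explanation) can be sketched in a few lines. This is only an illustration under stated assumptions: the feature names, the peer group, and the use of a mean absolute z-score as the anomaly measure are hypothetical and are not taken from the thesis.

```python
from statistics import mean, pstdev


def risk_score(entity, peers):
    """Score how far an entity's features lie from its peer group.

    `entity` is a dict of engineered features; `peers` is a list of such
    dicts for entities with the same label. Returns the mean absolute
    z-score and the per-feature contributions that explain it.
    """
    contributions = {}
    for name, value in entity.items():
        peer_values = [peer[name] for peer in peers]
        mu = mean(peer_values)
        sigma = pstdev(peer_values) or 1.0  # guard against constant features
        contributions[name] = abs(value - mu) / sigma
    return mean(list(contributions.values())), contributions


# Hypothetical engineered features for three peers and two entities.
peers = [
    {"monthly_volume": 10.0, "n_counterparties": 1.0},
    {"monthly_volume": 12.0, "n_counterparties": 1.0},
    {"monthly_volume": 14.0, "n_counterparties": 1.0},
]
typical = {"monthly_volume": 12.0, "n_counterparties": 1.0}
unusual = {"monthly_volume": 20.0, "n_counterparties": 1.0}

typical_score, _ = risk_score(typical, peers)
unusual_score, unusual_parts = risk_score(unusual, peers)
top_feature = max(unusual_parts, key=unusual_parts.get)
```

Here `top_feature` plays the role of the explanation: it names the feature that contributed most to the assigned score.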
Pretraining the Vision Transformer using self-supervised
methods for vision-based deep reinforcement learning
Supervised by Arlindo L. Oliveira and authored by Manuel
Goulão
The Vision Transformer architecture has proven to be
competitive in the computer vision (CV) space where it has
dethroned convolution-based networks in several benchmarks.
Nevertheless, Convolutional Neural Networks (CNNs) remain the
preferred architecture for the representation module in
Reinforcement Learning. In this work, we study pretraining a
Vision Transformer using several state-of-the-art
self-supervised methods and assess data-efficiency gains from
this training framework. We propose a new self-supervised
learning method called TOV-VICReg that extends VICReg to
better capture temporal relations between observations by
adding a temporal order verification task. Furthermore, we
evaluate the resulting encoders with Atari games in a
sample-efficiency regime, Procgen games for measuring
generalization, and an imitation learning task for a fast and
reliable comparison of the representations. Our
data-efficiency results show that the vision transformer, when
pretrained with TOV-VICReg, outperforms the other
self-supervised methods and the non-pretrained vision
transformer but still struggles to overcome a CNN. Our
generalization results show some limitations in our method
when used in more visually complex games, which leads to
degradation of the generalization performance. Nevertheless,
we were able to outperform a CNN in two of the ten Atari games
in a 100k-step evaluation and show a consistent
data-efficiency gain in comparison to the non-pretrained
vision transformer. Ultimately, we believe that such
approaches in Deep Reinforcement Learning (DRL) might be the
key to achieving new levels of performance as seen in natural
language processing and computer vision.
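A temporal order verification task of the kind mentioned above can be framed as binary classification: a short clip of observations is presented either in its true order or shuffled, and the model must tell which. The sketch below only builds such training samples; the clip length, the 50/50 sampling, and the function name are assumptions, and the exact formulation used in TOV-VICReg may differ.

```python
import random


def make_tov_sample(frames, rng, clip_len=3):
    """Draw a clip from a trajectory and return (clip, label).

    label is 1 when the clip keeps its temporal order and 0 when the
    frames were shuffled. Requires clip_len >= 2.
    """
    start = rng.randrange(len(frames) - clip_len + 1)
    order = list(range(clip_len))
    label = 1
    if rng.random() < 0.5:
        # Shuffle positions until the order actually changes.
        while order == sorted(order):
            rng.shuffle(order)
        label = 0
    clip = [frames[start + offset] for offset in order]
    return clip, label


rng = random.Random(0)
trajectory = list(range(10))  # stand-in for ten consecutive observations
samples = [make_tov_sample(trajectory, rng) for _ in range(50)]
```

In a real setup each integer would be an image observation, and the label would supervise a small head on top of the encoder being pretrained.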
Neural Models for Generating Clinically Accurate Chest X-Ray
Reports
Supervised by Arlindo L. Oliveira and Bruno Martins and
authored by André Leite
Image captioning models have improved steadily,
showing that artificial
intelligence can achieve successful results in computer vision
tasks. However, there are still some tasks within the range of
image captioning that need more focus, including the automatic
clinical report generation. The automatic generation of
radiology reports based on radiology images has attracted
increasing attention in the last few years. This is
motivated by the repetitive and exhausting work that these
clinical reports demand. Artificial neural networks that
address this task have evolved over the years, starting
as convolutional neural networks and moving to
transformer-based models. However, existing methodologies
focus more on one of two important aspects, the fluency and
human readability of the generated text, than on the other,
the clinical efficiency of the model.
Consequently, in this dissertation we propose a model capable
of achieving competitive results regarding the human
readability of the reports, as well as improving clinical
efficiency. We propose adapting the MedCLIP model into an
image-text encoder capable of concatenating image and
text. We further propose that this model be assisted by an
Information Retrieval mechanism that retrieves the reports
most similar to an input X-ray. On the MIMIC-CXR
dataset, our model improves on well-established models in
both natural language processing metrics and clinical
efficiency. Finally, we show that our model can produce
more human-readable reports than most state-of-the-art
models while preserving clinical accuracy.
Old photo and image restoration using deep learning
techniques
Supervised by Arlindo L. Oliveira and authored by José
Pereira
There are multiple factors that can contribute to the
degradation of an image. The process of recovering such images
to their initial state is called Image Restoration. Nowadays
many deep learning techniques have been proposed that claim to
solve this problem. In this work, I select a few deep learning
models that achieve state-of-the-art performance on different
restoration tasks (deblurring, denoising, super-resolution,
etc.), covering both single degradation (focusing on only one
type of degradation, as super-resolution methods do) and mixed
degradation (tackling all the defects at the same time). I
test them on a synthetically degraded dataset and evaluate
them according to two objective metrics (PSNR and SSIM) as
well as subjectively, through human perception. These are then
combined and compared with the state-of-the-art method in old
photo restoration which comprises an image-to-image
translation framework based on deep latent space translation.
This state-of-the-art approach outperformed all other methods
and their combinations by a large margin.
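For reference, PSNR, one of the two objective metrics mentioned above, follows directly from the mean squared error between the reference and the restored image. The sketch below assumes 8-bit greyscale images stored as nested lists; the function name and the sample pixel values are illustrative only.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref = [pixel for row in reference for pixel in row]
    res = [pixel for row in restored for pixel in row]
    if len(ref) != len(res):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, res)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


clean = [[100, 100]]
noisy = [[110, 90]]  # every pixel off by 10, so the MSE is 100
print(f"PSNR: {psnr(clean, noisy):.2f} dB")
```

Higher values indicate a restoration closer to the reference. SSIM is more involved and is usually taken from an image-processing library rather than reimplemented.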
Using a Siamese Network to Accurately Detect Ischemic Stroke
in Computed Tomography Scans
Supervised by Arlindo L. Oliveira and Catarina Fonseca and
authored by Beatriz Vieira
The diagnosis procedure of stroke, a leading cause of death
in the world, involves the acquisition of images using
computed tomography scans, making possible the assessment of
the severity of the incident and the type and location of the
lesion. The fact that the brain has two hemispheres with a
high level of anatomical similarity, exhibiting significant
symmetry, has led to extensive research based on the
assumption that a decrease in symmetry is directly related to
the presence of pathologies. This work is focused on the
analysis of the symmetry (or lack of it) of the two brain
hemispheres, and on the use of this information for the
classification of computed tomography brain scans of stroke
patients. The objective is to contribute to the process of
automatic identification of brain lesions caused by stroke
events. To perform this task, we used the Siamese Network
architecture, which uses two parallel neural networks that
share the same weights. The composed network receives an image
pair (the original image and the mirrored one) and a label
that reflects the presence or absence of stroke. The network then
extracts the relevant features and classifies the images
taking into account their similarity. The resulting network
can be used to classify unseen scans, according to the
perceived level of symmetry, into one of two classes:
evidence of stroke or absence of stroke. The accuracy of the
proposed method is approximately 72%, significantly
outperforming a standard convolutional network architecture,
which was used as a baseline.
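The core idea, two branches with shared weights applied to an image and its mirror, can be illustrated without a deep learning framework. In this minimal sketch the shared branch is a fixed linear projection with made-up weights, standing in for the trained convolutional branches of the thesis, and the asymmetry score is the distance between the two embeddings.

```python
import math


def mirror(image):
    """Flip an image (list of rows) left to right."""
    return [row[::-1] for row in image]


def embed(image, weights):
    """Shared branch: project the flattened image with the same weights."""
    flat = [pixel for row in image for pixel in row]
    return [sum(w * p for w, p in zip(unit, flat)) for unit in weights]


def asymmetry(image, weights):
    """Distance between the embeddings of an image and its mirror."""
    original = embed(image, weights)
    flipped = embed(mirror(image), weights)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(original, flipped)))


# Hypothetical weights for a 2x3 image; a real network would learn these.
shared_weights = [
    [0.5, -1.0, 0.25, 1.0, 0.0, -0.5],
    [1.0, 0.0, 0.0, 0.0, 1.0, 0.0],
]
symmetric_scan = [[1, 2, 1], [3, 4, 3]]
lesioned_scan = [[1, 2, 9], [3, 4, 3]]  # one side differs from the other
```

A perfectly symmetric input yields identical embeddings and a score of zero, while a one-sided change increases the score. A classifier trained on top of such a comparison is what separates the two classes.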
Siamese Transformer Networks for Improving Address
Matching
Supervised by Arlindo L. Oliveira and authored by André
Duarte
Address matching plays a very important role in the daily
activities of post offices and companies responsible for
processing and delivering packages. Address matching is a
subtask of geocoding, and consists of pairing addresses, from
multiple databases, that refer to the same place. Geocoding
aims to assign physical coordinates (latitude and longitude)
to an address so that delivery routes can be planned
accurately. Errors in address matching are quite harmful to
these companies, in economic, environmental, or
reputational terms. There are several
methodological approaches to perform address matching. Some
methodologies standardize the address or parse its elements
and then perform elementwise matching. These methodologies
are not perfect and end up requiring manual correction of the
address by a human. This dissertation contributes to the
solution of this problem by presenting a model that
successfully and efficiently pairs Portuguese addresses. The
proposed solution belongs to the Deep Learning field and
focuses on Siamese Neural Networks built from Pre-Trained
Transformers. In this field, there are already promising
results for similar tasks, which demonstrate the viability of
Deep Learning models for solving this kind of problem. The
results obtained on a real address matching task show that the
proposed solution is a promising approach. The model is able
to map the addresses in this dataset with an accuracy never
lower than 94% at the Artery level and 90% at the Door level.
A Modular Architecture for Model-Based Deep Reinforcement
Learning
Supervised by Arlindo L. Oliveira and authored by Tiago João
Gaspar Ribeiro de Oliveira
The model-based reinforcement learning (MBRL) paradigm, which
uses planning algorithms, has recently achieved unprecedented
results in the area of deep reinforcement learning (DRL).
These agents are quite complex and involve multiple
components, factors that can create challenges for research.
In this work, we propose a modular software architecture (our
implementation can be found at
https://github.com/GaspTO/Modular_MBRL) suited for these types
of agents, which makes it possible to implement different
algorithms and to easily configure each component (such as
different exploration policies, search algorithms, etc.). We
illustrate the use of this architecture by
implementing several algorithms and experimenting with agents
created using different combinations of these. We also suggest
a new simple search algorithm called averaged minimax that
achieved good results in this work. Our experiments also show
that the best algorithm combination is
problem-dependent.
Improving the Performance of Deep Neural Networks in Vision
Tasks with Attention Mechanisms
Supervised by Arlindo L. Oliveira and authored by Rafael
Gamanho Pedro
There is no single precise definition of "attention" for
neural networks. Broadly, attention mechanisms are neural
network layers that aggregate information from the entire
input data. How they do so depends on the specific problem
addressed and on the type of input data, such as a phrase
or an image. This work focuses on computer vision tasks.
Attention mechanisms have gained traction in natural language
processing, and their use in computer vision has been on the
rise for a few years. This usage of attention mechanisms is
somewhat recent and has been advancing quickly, with new
architectures published often. This thesis aims to study and
compare different attention mechanisms to improve the
performance in image classification tasks. Three use cases
related to medical imaging are used to assess the benefits
attention mechanisms bring to real-world scenarios. The
results show that there are scenarios where attention
mechanisms improve the performance on medical datasets.
However, the performance increase was not as consistent as
expected. The experiments also show that attention mechanisms
need more data than their conventional counterparts.
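Since the abstract notes that attention has no single definition, the sketch below shows only one common instance, scaled dot-product attention, in which every output is a weighted average over all input positions. It is not claimed to be one of the mechanisms compared in the thesis, and the toy vectors are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention for lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for query in queries:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in keys]
        weights = softmax(scores)
        # Each output mixes information from every input position.
        outputs.append([
            sum(weight * value[dim] for weight, value in zip(weights, values))
            for dim in range(len(values[0]))
        ])
    return outputs


# With identical keys the weights are uniform, so the output is the mean.
keys = [[1.0, 0.0], [1.0, 0.0]]
values = [[0.0, 2.0], [4.0, 6.0]]
result = attention([[1.0, 1.0]], keys, values)
```

In a vision model the queries, keys, and values would be learned projections of image patches or feature-map positions.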
Using Knowledge Graphs to model Digital Footprints
Supervised by Arlindo L. Oliveira and authored by André
Carlos Ruano Andrade Cavalheiro
The new generations are born into a world where the internet
is a natural extension of the real world. The online logs
created throughout their lives might remain long after they
are gone - detailed information about their everyday activity.
Currently, corporations use this data to predict short-term
actions in order to maximize the use of their services, which
is but one of many use-cases that such an opportunity
presents. Knowledge graphs, which have been the target of
intensive research in recent years, were used in this work to
model personal data. The project aims to create a framework
for centralizing a person's logs originating from multiple
sources on the web. Specifically, this work makes the
following contributions: 1. Developed a framework to store a
person's records into a usable and interpretable structure,
providing a review of its possibilities and limitations with
hopes of guiding future research. 2. Created a proof of
concept made from a single user's data downloaded from five of
the most widely used online platforms. 3. Performed
experiments using pre-established models based on the concepts
of metapaths, to explore the interactions between entities in
the network and explore its semantic and structural
value.
Evaluating generalization in Deep Reinforcement Learning with
Procedural Generated Environments
Supervised by Arlindo L. Oliveira and authored by Miguel
Borges Freire
Deep Reinforcement Learning agents, mainly those that learn
from visual observations, often fail to transfer their
knowledge to unseen environments. In games, standard Deep
Reinforcement Learning protocols commonly promote testing in
the same set of levels used in training. This practice leads
an agent to easily overfit a given training set, failing to
transfer its knowledge to out-of-distribution levels. To
overcome this problem, we construct two separate training and
test sets using procedurally generated environments from the
Procgen Benchmark. We use this benchmark to measure the extent
of overfitting and systematically study the effects of using
regularization and data augmentation methods on the capacity
of the agent to generalize. We found that, in general, using
regularization and data augmentation improves generalization,
with an efficacy that is dependent on the environment's
dynamics. Furthermore, we study how network architectural
decisions such as the depth and the width of the convolutional
network, the usage of pooling layers, skip-connections, and
modifications of the classification layer affect
generalization. Finally, we empirically demonstrate that
convolutional neural networks with small kernels in the early
convolutional layers can accomplish the same generalization
level as a deeper residual model.
Combining off and on-policy training in Deep
Reinforcement
Supervised by Arlindo L. Oliveira and authored by Alexandre
João Gomes Borges
MuZero is able to master both Atari games and board games by
learning a model of the environment, which is then used with
Monte Carlo Tree Search (MCTS) to decide what move to play in
each position. During tree search, the algorithm simulates
games by exploring several possible moves, and afterwards
picks the action that corresponds to the most promising
trajectory. Even though not all trajectories from these
simulated games are useful, none of them are used for
training. Using these trajectories would provide more data,
more quickly, leading to faster convergence and better sample
efficiency. Recent work introduced an off-policy value target
for AlphaZero that uses data from simulated games. Similarly,
in this work, we propose a way to obtain off-policy targets by
using data from simulated games in MuZero. We combine these
off-policy targets with the on-policy targets already used in
MuZero in several ways, and study the impact of these targets
and their combinations in two environments with distinct
characteristics.
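To make the idea of a value target built from a simulated trajectory concrete, the sketch below computes a discounted n-step return along one such trajectory and blends it with an on-policy target. Both the linear blend and the parameter `alpha` are assumptions for illustration; the abstract states only that the targets are combined in several ways.

```python
def n_step_return(rewards, discount, leaf_value):
    """Discounted return along a trajectory, bootstrapped at the leaf."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total + (discount ** len(rewards)) * leaf_value


def mixed_target(on_policy, off_policy, alpha):
    """Hypothetical combination: linear blend of the two value targets."""
    return (1.0 - alpha) * on_policy + alpha * off_policy


# Rewards and leaf value predicted along one simulated search trajectory.
off_policy_target = n_step_return([1.0, 1.0], discount=0.5, leaf_value=4.0)
target = mixed_target(on_policy=1.0, off_policy=off_policy_target, alpha=0.25)
```

Setting `alpha` to 0 recovers the purely on-policy target, and setting it to 1 uses only the data from the simulated game.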
Application of Deep Learning Techniques to the Diagnosis of
Medical Images
Supervised by Arlindo L. Oliveira and authored by Pedro
Miguel Carreto Vaz
Diabetic Retinopathy (DR) is the leading cause of visual
disability worldwide. Although it is highly treatable when
diagnosed in its earlier stages, there is currently a need for
cheaper and more accurate ways to do so. Medical images have
been used in diagnosis for a long time. Recent advancements in
the computer vision field have shown remarkable results
through the use of Convolutional Neural Networks, which have
been able to reach state-of-the-art results in image
segmentation. In this master's thesis, we implement a
V-Net-like architecture in Python and study how image
preprocessing techniques that highlight lesions associated
with DR, as well as different optimization metrics, affect
its results. The results show that the impact of these
variables changes
according to the lesion that we try to segment and that the
V-Net is capable of obtaining good results for some of the
segmentation problems.
MyWatson: A system for interactive access of personal
records
Supervised by Arlindo L. Oliveira and authored by Pedro
Miguel dos Santos Duarte
With the number of photos people take growing, it’s getting
increasingly difficult for a common person to manage all the
photos in their digital library, and finding a single specific
photo in a large gallery is proving to be a challenge. In this
thesis, the MyWatson system is proposed, a web application
leveraging content-based image retrieval, deep learning, and
clustering, with the objective of solving the image retrieval
problem, focusing on the user. MyWatson is developed on top of
the Django framework, a high-level Python Web framework, and
revolves around automatic tag extraction and a friendly user
interface that allows users to browse their picture gallery
and search for images via query by keyword. MyWatson’s
features include the ability to upload and automatically tag
multiple photos at once using Google’s Cloud Vision API, and
to detect and group faces according to their similarity by
using a convolutional neural network, built on top of Keras
and TensorFlow, as a feature extractor, together with a
hierarchical clustering algorithm that generates several
groups of clusters.
Besides discussing state-of-the-art techniques, presenting the
APIs and technologies used, and explaining the system’s
architecture in detail, this thesis reports a heuristic
evaluation of the interface, corroborated by the results of
questionnaires answered by the users. Overall, users
manifested interest in
the application and the need for features that help them
achieve a better management of a large collection of
photos.
Automatic Annotation of Unstructured Fields in Medical
Databases
Supervised by Arlindo L. Oliveira and Maria Luísa Torres
Ribeiro Marques da Silva Coheur. Authored by Margarida Andreia
Rosa Correia
The increased use of systems based on Electronic Health
Records caused an enormous increase in the information
available electronically, which can be processed by Data
Mining techniques, leading to relevant findings. The expected
result was that this information would become easy to access,
analyze and share. However, the text present in the clinical
notes is
written in natural language, and is, thus, unstructured, and
difficult to automatically process. These clinical notes might
contain pertinent data for the health of the patient. In this
thesis, with the help of Natural Language Processing and
Information Extraction techniques, we present a system that,
given a clinical note, extracts relevant named entities from
it, such as names of diseases, symptoms, treatments, diagnoses
and drugs, generating structured information from unstructured
free text. In addition, in order to avoid privacy issues and
considering that these clinical notes might contain references
to names of patients, doctors or other health professionals,
we also present an anonymization step. Finally, we add a
module that automatically corrects typos in these medical
notes. Final results show that the system, in general, is able
to recognize and interpret medical entities.
Biological Data Processing Using Grid Technologies
Supervised by Arlindo L. Oliveira and authored by Sérgio
Mendes Costa
At present there is a growing interest in the development of
systems in which scientific analysis with high computing or
data storage and processing requirements can be performed. The
cluster and Grid computing technologies have emerged as the
best support infrastructures for this type of system.
Biological sciences are among those that have benefited
most from the advancement of these technologies, namely in the
study of gene expression mechanisms. In that sense, the
discovery of transcription factor binding sites and the
analysis of gene expression data are particularly relevant. In
the first case, we usually search for short segments of DNA,
known as motifs, that are well conserved. In the second case,
we usually analyze microarray data using data mining
techniques like biclustering. In the context of this thesis,
efficient algorithms for motif inference in gene promoter
regions and for the analysis of gene expression data were made
available in the hermes cluster of Instituto Gulbenkian de
Ciência. The algorithms were developed in the context of the
BioGrid - Parallel Algorithms for Gene Annotation project.
During this work, the necessary tasks of implementing,
installing and testing were performed, as well as the
development of Web interfaces and documentation for every
program. In addition to that, a study was conducted in which
the model-based testing technique was used to evaluate the
software. The algorithms created in the context of the BioGrid
project are now available in a reliable, integrated and
user-friendly system for a large community of Bioinformatics
users.
Modelling and Inference of Gene Regulatory Networks
Supervised by Arlindo L. Oliveira and authored by José Miguel
Ranhada Vellez Caldas
A current problem in biology is how to find adequate models
for the dynamics of gene regulatory networks. Recent
technological advancements allow for the measurement of gene
mRNA levels, in a population of cells, over a period of time.
Given a particular gene regulatory network, time series for
its components, and a parametrizable mathematical model,
optimization algorithms may be used to fit the model's
parameters to the observed dynamics. This is useful for
validating both the hypothetical network and its model, and
for providing new insights about the underlying biological
system. In this thesis I analyze two case studies: the SOS DNA
damage repair network in E. coli and a hypothetical network
for the transcriptional regulation of the gene Flr1's response
to oxidative stress in yeast, induced by the drug Mancozeb.
For the SOS network, I use a known piecewise-linear model and
the parameter inference algorithm BFGS. I compare two
adaptations of piecewise linear models, obtaining a general
form that encompasses both, and I describe a new version of an
optimization algorithm that may be used for inferring
parameters in that model. These results are applied to the
Flr1 network. Both models are used to extract information that
is confirmed by biological literature.
Pathological Analysis of Tissues using Deep Neural
Networks
Supervised by Arlindo L. Oliveira and João Cassis. Authored
by Xavier Abreu Dias
Pathological images or biopsy images are samples of tissues
from a specific location of a human or animal body.
Pathological analysis is necessary whenever there are any
lesions or any indicative symptoms for a certain disease, like
cancer or the presence of bacteria in tissues. Nowadays, a
large number of biopsies are requested each day and sent for
analysis by pathologists in order to make a diagnosis. This
process can be difficult and time-consuming, and requires
experience in detecting abnormal tissues. With the advances of
technology, powerful scanners have been developed that can
magnify 40× and digitize whole slide images, making it
possible to see at the 250 µm scale. The state-of-the-art
supervised Deep Learning methods applied to slide image
classification or disease detection rely mostly on deep
(rich) annotations, i.e. specific information on where the
disease, if any, is located. This dissertation aims to
contribute a semi-supervised architecture that enables
models to be built, using Multiple Instance Learning and
Online Hard Example Mining, from weakly-annotated datasets
(whose labels only indicate whether or not the whole slide has
the disease). The architecture presented in this dissertation
consists of an application of these semi-supervised methods
to a deep architecture with an attention module. The whole
architecture is fit on a set of biopsy images provided by
Hospital da Luz (Lisbon), some of which contain Helicobacter
pylori, achieving an accuracy of 91.67% and capturing all
positive ones.
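Under the standard Multiple Instance Learning assumption, a slide (the bag) is positive when at least one of its patches (the instances) is positive, so a slide-level score can be obtained by pooling patch-level scores. The sketch below uses max pooling and a top-k selection of the most suspicious patches. It is a simplified stand-in for the attention-based architecture of the thesis, and the patch probabilities are made up.

```python
def bag_probability(instance_probs):
    """Slide-level score: the most suspicious patch decides (max pooling)."""
    return max(instance_probs)


def top_k_instances(instance_probs, k):
    """Indices of the k highest-scoring patches, best first."""
    ranked = sorted(range(len(instance_probs)),
                    key=lambda index: instance_probs[index],
                    reverse=True)
    return ranked[:k]


# Hypothetical patch-level probabilities produced by a patch classifier.
patch_probs = [0.1, 0.9, 0.3]
slide_score = bag_probability(patch_probs)
hardest = top_k_instances(patch_probs, k=2)
```

Selecting the highest-scoring patches of each slide for the next training round is one simple way to focus learning on difficult examples, in the spirit of hard example mining.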
Deep Convolutional Encoder-Decoder Architectures for
Clinically Relevant Coronary Artery Segmentation
Supervised by Arlindo L. Oliveira and Mário Alexandre Teles
de Figueiredo. Authored by João Lourenço Coelho da Silva
X-ray coronary angiography is a crucial clinical procedure
for the diagnosis and treatment of coronary artery disease,
which accounts for roughly 16% of global deaths every year.
However, the images acquired in this procedure have low
resolution and poor contrast, making lesion detection and
assessment challenging. Accurate coronary artery segmentation
not only helps mitigate these problems, but also allows the
extraction of relevant anatomical features for further
analysis by quantitative methods. Although automated
segmentation of coronary arteries has been proposed before,
previous approaches have used non-optimal segmentation
criteria, leading to less useful results. Most methods either
segment only the major vessel, discarding important
information from the remaining ones, or segment the whole
coronary tree, based mostly on contrast information, producing
a noisy output that includes vessels that are not relevant for
diagnostic or therapeutic purposes. In this work, vessels are
segmented according to their clinical relevance, using a
segmentation criterion developed in collaboration with expert
cardiologists. Additionally, the catheter, whose diameter is
known and provides a scale factor that may be useful for
diagnosis, is segmented simultaneously. To derive the optimal
approach, an extensive comparative study of encoder-decoder
architectures was conducted. Based on the UNet++, a new
computationally efficient and high-performing decoder
architecture is proposed, the EfficientUNet++. Combined with
EfficientNet encoders, the EfficientUNet++ establishes a line
of efficient and high-performing segmentation models, whose
best-performing member achieves a generalized Dice score of
0.9202 ± 0.0356, and artery and catheter class Dice scores
of 0.8858 ± 0.0461 and 0.7627 ± 0.1812,
respectively.
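As a reminder of what the per-class figures above measure, the Dice score for one class is twice the overlap between the predicted and reference masks divided by the sum of their sizes. The sketch below works on flattened label maps; the label values are arbitrary, and the generalized Dice score, which additionally weights the classes, is not reproduced here.

```python
def dice_score(predicted, target, label):
    """Dice coefficient of one class between two flattened label maps."""
    if len(predicted) != len(target):
        raise ValueError("label maps must have the same length")
    pred_mask = [value == label for value in predicted]
    true_mask = [value == label for value in target]
    overlap = sum(p and t for p, t in zip(pred_mask, true_mask))
    size = sum(pred_mask) + sum(true_mask)
    if size == 0:
        return 1.0  # class absent from both maps: treated as perfect
    return 2.0 * overlap / size


# Hypothetical labels: 0 = background, 1 = artery, 2 = catheter.
prediction = [0, 1, 1, 2]
reference = [0, 1, 2, 2]
artery_dice = dice_score(prediction, reference, label=1)
catheter_dice = dice_score(prediction, reference, label=2)
```

A score of 1 means perfect overlap and 0 means none, so the lower catheter score reported above reflects a harder, thinner structure.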
Automated Assessment of Coronary Artery Stenosis in X-ray
Angiography using Deep Neural Networks
Supervised by Arlindo L. Oliveira and Mário Alexandre Teles
de Figueiredo. Authored by Dinis Lourenço Tavares Rodrigues
Several methods for quantitative severity assessment of
coronary artery stenosis exist as well as different measures,
leading to distinct management of treatment procedures. It is
of utmost importance to properly identify and classify all
possible stenoses in an individual. A three-step deep learning
framework was designed to automate the
detection and assessment of stenosis severity. This study
showcases a new clinically obtained dataset of properly
de-identified X-ray invasive coronary angiography (ICA)
sequences of 438 patients from Hospital de Santa Maria. For
each sequence, the frames filled with radio-opaque contrast,
in which stenoses are fully visible, were annotated. Stenosis
bounding boxes were annotated by an expert physician on
reference frames and propagated to the remaining frames using
image processing techniques. Transfer learning with deep
neural networks is exploited for supervised learning at each
step: CNNs for angle view selection of the Left/Right Coronary
Artery (LCA/RCA), achieving 0.97 accuracy; single-shot
detectors for stenosis detection, achieving 0.83/0.81 mAR for
the LCA/RCA, respectively; and a new region-of-interest boost
approach with CNNs for stenosis severity regression of the
RCA. Our method showcases the
importance of transfer learning in stenosis severity
assessment with limited data, achieving considerable
performance. To the best of the author's knowledge, this is
the first time that the instantaneous wave-free ratio (iFR)
was used as a metric for stenosis severity assessment tasks
using deep learning techniques.
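The recall figures reported for stenosis detection depend on how predicted and annotated bounding boxes are matched. The sketch below assumes boxes given as (x_min, y_min, x_max, y_max) tuples and a COCO-style average of recall over IoU thresholds from 0.50 to 0.95. The abstract does not say how mAR was computed, so the thresholds, the simplified matching (no one-to-one assignment) and all function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at(gt_boxes, pred_boxes, threshold):
    """Fraction of annotated boxes overlapped by some prediction.

    Simplification: a prediction may match several annotations.
    """
    if not gt_boxes:
        return 0.0
    hits = sum(
        any(iou(gt, pred) >= threshold for pred in pred_boxes)
        for gt in gt_boxes
    )
    return hits / len(gt_boxes)


def average_recall(gt_boxes, pred_boxes, thresholds=None):
    """Mean recall over IoU thresholds (0.50 to 0.95 in steps of 0.05 by default)."""
    if thresholds is None:
        thresholds = [t / 100 for t in range(50, 100, 5)]
    recalls = [recall_at(gt_boxes, pred_boxes, t) for t in thresholds]
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    annotated = [(0, 0, 100, 100)]   # hypothetical stenosis annotation
    predicted = [(0, 0, 100, 73)]    # hypothetical detector output
    print("IoU:", iou(annotated[0], predicted[0]))
    print("average recall:", average_recall(annotated, predicted))
```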
Applying Deep Learning to Medical Images
Supervised by Arlindo L. Oliveira and Mário Alexandre Teles
de Figueiredo. Authored by Ricardo Jorge da Silva Diniz
Deep convolutional networks have recently been embraced by
the academic community as a competitive solution for visual
recognition tasks. Among these networks, fully convolutional
neural networks (FCNNs) have been gaining traction, as they
drop the traditional fully connected layers of CNNs in
favor of more convolutional layers. The original FCNN,
using layer skipping, was capable of achieving great results
when provided with enough samples. This architecture was
extended into the U-Net, which outperforms the FCNN while
being both faster and less computationally demanding.
Both architectures are designed to work
with 2D input images. However, most medical images, such as
ultrasounds and MRIs, are 3D. The V-Net was built upon the
underlying principles behind the U-Net and the FCNN.
It is a volumetric FCNN which introduces a new
objective function, discards pooling layers in favor of more
convolutional layers and performs residual propagation. V-Nets
have achieved good performance across all visual recognition
tasks, being comparable to the state-of-the-art solutions
while requiring a fraction of the processing time. In this
thesis, several variants of the U-Net and V-Net are implemented,
firstly, to attest to their good performance on visual
segmentation tasks of medical data and, secondly, to assess
how the objective function, the kernels’ receptive fields,
residual propagation, activation functions and optimization
method impact the model’s performance. A secondary objective
of this thesis is to bridge the gap between theoretical
knowledge and practical implementations by analyzing Google’s
TensorFlow API, which was designed specifically for
machine learning based on distributed computing.
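The abstract does not name the V-Net's objective function. In the original V-Net publication it is a loss based on the Dice coefficient with squared terms in the denominator, and the sketch below assumes that formulation. It works on flat lists of foreground probabilities and binary targets; the smoothing constant `eps` is an assumption added only to avoid division by zero, not part of the source.

```python
def soft_dice_loss(probs, target, eps=1e-6):
    """One minus the soft Dice coefficient (V-Net-style, squared denominator).

    probs:  predicted foreground probabilities in [0, 1], one per voxel
    target: binary ground-truth labels (0 or 1), one per voxel
    eps:    small smoothing constant (assumption, avoids division by zero)
    """
    intersection = sum(p * t for p, t in zip(probs, target))
    denominator = sum(p * p for p in probs) + sum(t * t for t in target)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice


if __name__ == "__main__":
    target = [1, 0, 1, 0]
    print("perfect prediction:", soft_dice_loss([1.0, 0.0, 1.0, 0.0], target))
    print("uncertain prediction:", soft_dice_loss([0.5, 0.5, 0.5, 0.5], target))
    print("inverted prediction:", soft_dice_loss([0.0, 1.0, 0.0, 1.0], target))
```

Because the loss is a ratio of overlap to total mass, it is not dominated by the background voxels, which is useful when the structure of interest occupies only a small part of the volume.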
Imputation Techniques for Clinical Data of Ischemic Stroke
Patients
Supervised by Arlindo L. Oliveira and Alexandre Paulo
Lourenço Francisco. Authored by Filipa de Matos Marques
In the 21st century, every year, approximately 880 thousand
people living in Europe suffer an ischemic stroke. Predicting
the patient’s outcome is key to choosing the course of
treatment. In this master's thesis, the functional outcome,
given by the binary version of the modified Rankin Scale, was
predicted at two points in time: three months and one year
after the stroke took place. Often, the data provided by health
organisations to conduct these studies is incomplete, which can
impair the results. Thus, the need arises to choose a proper
way to handle the missing data. Here, missing values were
imputed with six different methods and the classifiers were
then trained with seven distinct machine learning models. The
areas under the receiver operating characteristic curve for
the best classifiers, at the three-month and one-year marks,
were 0.8217 and 0.7537, respectively. Moreover, no
statistically significant difference was found
between the performance of the distinct imputation methods for
each machine learning model.
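The pipeline described above (impute, train, score by area under the ROC curve) can be sketched in a few lines. The abstract names neither the six imputation methods nor the cutoff used to binarize the modified Rankin Scale, so the column-mean imputation, the cutoff of 2 and all function names below are assumptions chosen only for illustration.

```python
from statistics import mean


def impute_mean(rows):
    """Replace None entries in each column with that column's observed mean."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fallback of 0.0 for a fully missing column is an arbitrary choice.
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def dichotomize_mrs(mrs, cutoff=2):
    """Binary outcome: 1 if mRS is above the cutoff, else 0 (cutoff is assumed)."""
    return 1 if mrs > cutoff else 0


def roc_auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical patient table with two numeric features and missing values.
    table = [[1.0, None], [3.0, 4.0], [None, 6.0]]
    print(impute_mean(table))
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```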
WeatherIST - iOS application for detailed weather prevision
in Continental Portugal
Supervised by Arlindo L. Oliveira and authored by Tiago João
Alves Duarte
This work is the result of a need to move the meteorological
forecast system developed by METEO-IST to an iOS application.
METEO-IST is a weather computation server owned by IST that
calculates with great accuracy the different weather
conditions (rain, wind, humidity, etc.) anywhere within the
Portuguese continental territory. The predictions are
computed frequently (every 15 minutes) and
exhibit great precision, distinguishing the system from the global
meteorological systems that are currently available. At present,
the system makes the forecasts available through the group's
website. At the request of several users, the objective was to
create a native iOS application for the iPhone that provides
these forecasts by taking advantage of the device’s
capabilities.
Predicting Frequency and Claims of Health Insurance with
Machine Learning techniques
Supervised by Arlindo L. Oliveira and Luís Miguel Veiga Vaz
Caldas de Oliveira. Authored by Pedro Octávio Couto Gonçalves
In the health insurance industry, policies are typically
one-year contracts that are renewed after these twelve months. At
Multicare, this renewal starts to be negotiated at the end of
the first nine months of the current annuity. At this point, it
is necessary to predict how the present annuity
will end, i.e., to forecast the loss ratio of
the last three months of the annuity from the loss
ratios of the first nine months. This problem is currently
handled using a time series algorithm, ARIMA, that forecasts
future loss ratios considering only the past ones and ignoring
all other external information that can also prove useful in
predicting the behaviors of the insured population, both in
terms of frequency of usage of the insurance and in terms of
the cost of medical acts. This study incorporates a wide
variety of external variables from different sources into
the traditional datasets of Multicare and compares several
types of tree-based machine learning models, aiming to find
the ones that best predict the claims and costs of the insured
population. The main contribution of this work is the proposal
of a new prediction model for the claims and costs of the
insured population of health insurance and its
comparison with the model that is currently in production at
Multicare, based on ARIMA time series.
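The forecasting setup described above can be made concrete with a small arithmetic sketch. It assumes the usual definition of the loss ratio (claims cost divided by premiums) and shows how a forecast for the last three months combines with the nine observed months into a projected annual figure. The forecast value itself is a placeholder; neither ARIMA nor the tree-based models from the thesis are implemented here, and all names and numbers are hypothetical.

```python
def loss_ratio(claims, premiums):
    """Loss ratio as claims cost divided by premiums (assumed definition)."""
    if premiums <= 0:
        raise ValueError("premiums must be positive")
    return claims / premiums


def projected_annual_loss_ratio(claims_9m, premiums_9m, forecast_lr_3m, premiums_3m):
    """Combine nine observed months with a forecast for the last three months.

    forecast_lr_3m is whatever a forecasting model returns for the last
    quarter; here it is simply passed in as a number.
    """
    forecast_claims_3m = forecast_lr_3m * premiums_3m
    return loss_ratio(claims_9m + forecast_claims_3m, premiums_9m + premiums_3m)


if __name__ == "__main__":
    # Hypothetical figures for a single policy, in arbitrary currency units.
    observed = loss_ratio(600.0, 900.0)
    projected = projected_annual_loss_ratio(600.0, 900.0, 0.8, 300.0)
    print("loss ratio after nine months:", observed)
    print("projected annual loss ratio:", projected)
```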