PhD Thesis: Ghazal Tashakor: January 26, 2021, 18:00. 

Title: Scalable agent-based model simulation using distributed computing on system biology.

TDX Source:

Abstract: Agent-based modeling is a useful computational tool for simulating complex behavior using rules at micro and macro scales. The complexity of this type of modeling lies in defining the rules that give the agents their structural elements and their static and dynamic behavior patterns. This thesis addresses the definition of complex models of biological networks that represent cancer cells, so that non-expert users of computer systems can obtain behaviors under different scenarios by means of simulation and study the evolution of the metastatic process.

In addition, a proof of concept has been developed that incorporates dynamic network analysis techniques and machine learning into agent-based models, built on a federated simulation system, to improve the decision-making process. For this thesis, the graph-based representation of complex biological networks has been analyzed from the simulation point of view, to investigate how to integrate the topology and functions of this type of network with an agent-based model. For this purpose, the agent-based model (ABM) has been used as a basis for the construction, grouping, and classification of the network elements representing the structure of a complex and scalable biological network.

The simulation of complex models with multiple scales and multiple agents gives scientists who are not computer experts a useful tool to execute a complex parametric model and use it to analyze scenarios or predict variations according to different patient profiles.
The development has focused on an agent-based tumor model that has evolved from a simple and well-known ABM. The variables and dynamics referenced by the Hallmarks of Cancer have been incorporated into a complex graph-based model. This model is used to represent different levels of interaction and dynamics within cells during the evolution of a tumor, with different degrees of representation (at the molecular/cellular level).

A simulation environment and workflow have been created to build a complex, scalable network based on a tumor growth scenario. In this environment, dynamic techniques are applied to characterize the tumor network’s growth under different patterns. The experimentation has been carried out in the developed simulation environment by executing models for different patient profiles, as a sample of its functionality, to calculate parameters of interest for the non-computer expert, such as the evolution of the tumor volume.


PhD Thesis: Elham Shojaei: October 19, 2020, 10:30. (Researcher at Leibniz Rechenzentrum, der Bayerischen Akademie der Wissenschaften)

Title: Simulation for Investigating Impact of Dependent and Independent Factors on Emergency Department System Using High Performance Computing and Agent-based Modeling.

TDX Source:

Abstract: Increased life expectancy and population aging in Spain, along with associated health conditions such as non-communicable diseases (NCDs), have been suggested to contribute to higher demand on the Emergency Department (ED). Spain is one of the countries in which EDs carry a very high burden of patients with NCDs. These patients need to access healthcare systems very often, and many of them are readmitted even though they are not in an emergency or dangerous situation. Furthermore, many NCDs are a consequence of lifestyle choices that can be controlled. Usually, the living conditions of each chronic patient affect their health variables and change the values of these variables, which can shift a patient with an NCD from a stable to an unstable condition and result in an ED visit.

In this study, a new method for the prediction of future performance and demand in the emergency department (ED) in Spain is presented. Prediction and quantification of the behavior of the ED are, however, challenging, as the ED is one of the most complex parts of a hospital. The behavior of Spain’s EDs in future years was predicted using detailed computational approaches integrated with clinical data. First, statistical models were developed to predict how the population and age distribution of patients with non-communicable diseases will change in Spain in future years. Then, an agent-based modeling approach was used to simulate the emergency department and predict the impact of these changes in the population and age distribution of patients with NCDs on ED performance, reflected in hospital length of stay (LoS), between 2019 and 2039.

In another part of this study, we propose a model that helps to analyze the behavior of chronic disease patients, with a focus on heart failure patients, based on their lifestyle. We consider how living conditions affect the signs and symptoms of chronic disease and, accordingly, how these signs and symptoms affect chronic disease stability. We use an agent-based model, a state machine, and a fuzzy logic system to develop the model. Specifically, we model the ‘living condition’ parameters that can influence the relevant medical variables. These variables determine the stability class of the chronic disease.

This thesis also investigates the impact of Tele-ED on the behavior, time, and efficiency of the ED and on hospital utilization, and proposes a model for Tele-ED, which delivers medical services online. Simulation and agent-based modeling are powerful tools that allow us to model and predict the behavior of the ED as a complex system for a given set of desired inputs. Each agent responds to its environment and to other agents based on a set of rules. This thesis can answer several questions regarding the demand and performance of the ED in the future, and it provides health care providers with quantitative information on economic impact, affordability, required staff, and physical resources. Predicting the behavior of patients with NCDs can also help health policy makers plan for increased health education in the community, reduce risky behavior, and teach people to make healthy decisions throughout their lives. Predicting the behavior of Spain’s EDs in future years can help care providers and decision-makers improve health care management.


PhD Thesis: Diego Montezanti: March 18, 2020 09:00 (ARG). (Researcher at III-LIDI UNLP)

Title: Soft Error Detection and Automatic Recovery in High Performance Computing Systems (SEDAR).

TDX Source:


Reliability and fault tolerance have become aspects of growing relevance in the field of HPC, due to the increased probability that faults of different kinds will occur in these systems. This is fundamentally due to the increasing complexity of processors in the search for better performance, which leads to a rise in the scale of integration and in the number of components that work near their technological limits and are therefore increasingly prone to failures. Another contributing factor is the growth in the size of parallel systems, in terms of number of cores and processing nodes, to obtain greater computational power.

As applications demand longer uninterrupted computation times, the impact of faults grows, due to the cost of relaunching an execution that was aborted due to the occurrence of a fault or concluded with erroneous results. Consequently, it is necessary to run these applications on highly available and reliable systems, requiring strategies capable of providing detection, protection and recovery against faults.

In the coming years, systems are expected to reach exascale, with supercomputers comprising millions of processing cores and capable of performing on the order of 10^18 operations per second. This is a great window of opportunity for HPC applications, but it also increases the risk that they will not complete their executions. Recent studies show that, as systems continue to include more processors, the Mean Time Between Errors decreases, resulting in higher failure rates and increased risk of corrupted results; large parallel applications are expected to deal with errors that occur every few minutes, requiring external help to progress efficiently. Silent Data Corruptions are the most dangerous errors that can occur, since they can generate incorrect results in programs that appear to execute correctly. Scientific applications and large-scale simulations are the most affected, making silent error handling the main challenge towards resilience in HPC. In message passing applications, a silent error affecting a single task can produce a pattern of corruption that spreads to all communicating processes; in the worst-case scenario, the erroneous final results cannot be detected at the end of the execution and will be taken as correct.

Since scientific applications have execution times on the order of hours or even days, it is essential to find strategies that allow applications to reach correct solutions in a bounded time, despite the underlying failures. These strategies also prevent energy consumption from skyrocketing, since without them the executions would have to be launched again from the beginning. However, the most popular parallel programming models used in supercomputers lack support for fault tolerance.
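
The observation above that the Mean Time Between Errors shrinks as systems include more processors can be illustrated with a back-of-the-envelope calculation. The sketch below assumes independent, identical components with constant error rates, an assumption introduced here for illustration and not taken from the thesis; under that assumption, the system-level MTBE is the per-component MTBE divided by the number of components. The function name and the sample figures are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def system_mtbe_hours(component_mtbe_hours: float, n_components: int) -> float:
    """System-level MTBE, assuming independent, identical components
    with constant error rates (illustrative assumption only)."""
    if n_components <= 0:
        raise ValueError("n_components must be positive")
    return component_mtbe_hours / n_components


if __name__ == "__main__":
    # Hypothetical input: each node sees one error every 10 years on average.
    node_mtbe = 10 * HOURS_PER_YEAR
    for nodes in (1_000, 100_000, 1_000_000):
        minutes = system_mtbe_hours(node_mtbe, nodes) * 60
        print(f"{nodes:>9} nodes: one error every {minutes:.1f} minutes on average")
```

With these made-up inputs, a system of one million nodes would see an error roughly every five minutes, which is the order of magnitude the abstract refers to.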


PhD Thesis: Jorge Luis Villamayor Leguizamón: November 30, 2018, 12:30. (R&D Product Owner / IT Project Manager at Giesecke+Devrient Mobile Security)

Title: Fault Tolerance Configuration and Management for HPC Applications using RADIC.

TDX Source:

Abstract: High Performance Computing (HPC) systems continue to grow exponentially in component count and density in order to deliver the computational power that applications demand. At the same time, cloud computing is becoming popular, as key features such as scalability, pay-per-use and availability continue to evolve. It is also becoming a competitive platform for running parallel HPC applications due to the increasing performance of virtualized, highly available instances. However, increasing the number of components to create larger systems tends to increase the frequency of failures in both cluster and cloud environments. Nowadays, HPC systems have a failure rate of around 1000 per year, meaning a failure approximately every 8 hours.

Fault Tolerance (FT) techniques need to be applied to MPI parallel executions in both cluster and cloud environments. With FT techniques, high availability is ensured for parallel applications. Some FT solutions require administrator privileges to be installed on the cluster nodes. Moreover, when failures appear, human intervention is required to recover the application. A solution that minimizes user and administrator intervention is preferred.

Regarding cloud environments, we propose Resilience as a Service (RaaS), a fault-tolerant framework for HPC applications. RaaS provides clouds with a highly available, distributed and scalable fault-tolerant service. It redesigns traditional HPC protection and recovery mechanisms to natively leverage cloud capabilities and their multiple alternatives for implementing FT tasks. This thesis contributes a Multi-platform Resilience Manager (MRM), suitable for traditional bare-metal clusters and clouds (public and private). The presented solution provides FT in an automatic, distributed and transparent manner at the application and user levels, according to user, application, and runtime requirements. It gives users critical FT information, allowing them to trade off cost and protection while keeping the mean time to repair within acceptable ranges.

Several experimental environments, such as bare-metal clusters and clouds (public and private) running different parallel applications, were used during the experimental validation. The experiments verify the functionality and improvement of the contributions.
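
The failure-rate figure quoted in this abstract (around 1000 failures per year, or about one every 8 hours) can be checked with simple arithmetic. The snippet below is a minimal sketch of that conversion; the function name is hypothetical, and it assumes failures are spread evenly over a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def mean_hours_between_failures(failures_per_year: float) -> float:
    """Average interval between failures, assuming they are spread
    evenly over the year (illustrative simplification)."""
    if failures_per_year <= 0:
        raise ValueError("failures_per_year must be positive")
    return HOURS_PER_YEAR / failures_per_year


if __name__ == "__main__":
    # Prints 8.76, consistent with the "approximately 8 hours" in the abstract.
    print(f"{mean_hours_between_failures(1000):.2f}")
```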

PhD Thesis: Laura María Espínola Brítez: November 30, 2018, 9:30. (R&D QA Manager at Giesecke+Devrient Mobile Security)

Title: Efficient Communication Management in Cloud Environments.

TDX Source:

Abstract: Scientific applications with High Performance Computing (HPC) requirements are migrating to cloud environments due to the facilities that they offer. Cloud computing plays a major role considering the compute power that it provides while avoiding the cost of maintaining a physical cluster. With features like elasticity and pay-per-use, it helps to reduce the researchers’ procurement risk. Most HPC applications are implemented using the Message Passing Interface (MPI), which is a key component in common and distributed computing tasks.

However, for this kind of application in cloud environments, the major drawback is the loss of execution performance, due to the virtualized network, which affects communication latency and bandwidth. In this thesis, a Dynamic MPI Communication Balance and Management (DMCBM) is presented to overcome the communication challenge of HPC applications in the cloud. DMCBM is implemented as middleware between the users’ application and the execution environment. It improves message communication latency in cloud-based systems and helps users to detect mapping and parallel implementation issues.

Our solution dynamically rebalances communication flows at higher levels of the virtualized HPC stack, e.g. over the MPI communication layer, to remove communication hot-spots and congestion in the underlying layers. DMCBM abstracts the communication state between application processes based on latency measurements. DMCBM achieves lower application execution time in case of congestion, obtaining better performance in clouds.

The NAS Parallel Benchmarks and a real particle dynamics simulation application, NBody, are used to show the performance of DMCBM, obtaining an improvement of up to 10% in execution time and a communication time reduction of about 16% in congestion scenarios.


PhD Thesis: Pilar Gómez Sánchez: June 22, 2018, 12:00. (Assistant Professor UAB)

Title: Analyzing the Parallel Applications’ I/O Behavior Impact on HPC Systems.

TDX Source:


The volume of data generated by scientific applications grows and the pressure on the I/O system of HPC systems also increases. For this reason, an I/O behavior model is proposed for scientific MPI (Message Passing Interface) parallel applications. The goal is to analyze the applications’ impact on the I/O system. Analyzing the MPI parallel applications at POSIX-IO level allows observing how the application’s data are treated at that level.

In this research work, the following is presented: the I/O behavior model definition at POSIX-IO level (PIOM-PX model definition), the methodology applied to extract this model, and the PIOM-PX-Trace-Tool. As PIOM-PX is based on the I/O phase concept, it can identify the most significant phases, that is, the phases that have more influence than others on the I/O system and could cause a bottleneck or poor performance. Analysis based on I/O phases allows identifying, delimiting, and trying to reduce each phase’s impact on the I/O system.

PIOM-PX is part of the proposed model PIOM. PIOM integrates the I/O behavior model at POSIX-IO level (PIOM-PX) and the I/O behavior model at MPI-IO level (PIOM-MP, formerly known as PAS2P-IO). The model provides the information necessary to replicate an application’s behavior in different systems using programmable synthetic programs. PIOM-PX-Trace-Tool allows interception of the POSIX-IO instructions used during the application execution. The experiments were executed on several standard HPC systems and on a cloud platform, where the utility of the proposed model PIOM could be tested.

PhD Thesis: Cecilia Elizabeth Jaramillo Jaramillo: July 21, 2017, 11:00. (Researcher at Computer Science Department. Universidad ISRAEL. Quito, Ecuador)

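Title: Modelización y Simulación de la transmisión por contacto de una infección nosocomial en el servicio de urgencias hospitalarias (Modeling and Simulation of the Contact Transmission of a Nosocomial Infection in the Hospital Emergency Department).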

TDX Source:


A nosocomial infection is an infection caused by microorganisms acquired within healthcare environments, and it is one of the main threats faced by hospitalized patients. Methicillin-resistant Staphylococcus aureus (MRSA) is one of the most common and dangerous microorganisms in hospital settings, and it can cause serious skin, wound, organ and even blood-borne infections (bacteremia).

In a healthcare environment, such as the emergency department, constant interaction between patients, healthcare workers and the environment contributes to MRSA transmission. The most common routes of transmission are the hands of the healthcare workers and contaminated medical instruments or objects of the environment. To counteract the transmission, health services have implemented certain actions called infection control measures.

This research addresses the issue of the transmission of nosocomial infection by contact in an emergency service, using the capacity of agent-based simulation to represent social phenomena and the human dimension. Agent-based computational models allow us to evaluate potential solutions to specific situations in a virtually created environment.

As a result of this research, a simulation tool of contact transmission of MRSA has been obtained, the MRSA-T-Simulator. The main objective of this tool is to allow the construction of virtual scenarios in order to study the phenomenon of MRSA transmission and to evaluate the potential impact of the implementation of different infection control measures on propagation rates.


PhD Thesis: Eva BruBalla: July 21, 2017, 9:30. (Assistant Professor at Gimbernat Schools, Spain)

Title: Scheduling non critical patients’ admission in a hospital emergency department. 

TDX Source:


The increase in life expectancy, the progressive aging of the population and a greater number of chronic diseases are factors that contribute significantly to the growing demand for urgent medical care and, consequently, in many cases, to the saturation of Emergency Departments (EDs). Taking into account the limitations on available resources as well, this constant risk of ED saturation is one of the most important current problems in health systems around the world, since it often results in an excessive length of stay of patients in the service and, consequently, generates dissatisfaction.

The results presented in this study aim to contribute to the improvement of the quality of care provided in EDs. We propose a method to reduce the total length of stay of patients in the service, through a model for planning the arrival of non-critical patients. The model is based on a detailed characterization of the system in terms of its care capacity and the number of patients attending each hour, updated dynamically. The use of simulation allows us to obtain knowledge about the behavior of the system through the prediction of patient waiting times for a specific situation or scenario, determined by the way patients arrive at the service and the available healthcare staff resources. A first contribution of the research is the definition of an analytical model for calculating the theoretical throughput of a given healthcare staff configuration. The objective of this first part of the research is to evaluate the responsiveness of healthcare staff to service demand, depending on the configuration of doctors, nurses, admission and triage personnel, and the model of patient flow throughout the service. The second contribution is the definition of a model for scheduling the admission of non-critical patients into the service, by redistributing them with respect to the input pattern initially foreseen from the hospital’s historical data. We have been able to verify the effectiveness of the proposed scheduling model based on actual data provided by the Hospital de Sabadell, as the reference hospital, and using simulation to assess the results of its application.

The described research contributions offer ED managers new knowledge about the behavior of the service, which may be relevant in decision making regarding the improvement of service quality. This is of great interest given the growing demand expected for the service in the very near future.


PhD Thesis: Joe Carrion Jumbo: July 20, 2017, 11:00. (Researcher at Computer Science Department. Universidad ISRAEL. Quito, Ecuador)

Title: Mejorando la red de los servicios de motores de búsqueda a través de enrutamiento basado en aplicación (Improving the Network of Search Engine Services through Application-Based Routing).

TDX Source:


Large-scale computer systems like search engines provide services to thousands of users, and their user demand can change suddenly. This unstable demand has a significant impact on the service components (such as the network and hosts). The system should be able to address unexpected scenarios; otherwise, users would be forced to leave the service. A search engine has a typical architecture consisting of a Front Service that processes user requests, an Index Service that stores the information collected from the internet, and a Cache Service that manages efficient access to frequently used content.

The scientific advances behind these services are, in general, emerging technology. The network services of a search engine require specialized planning. This research studies the traffic pattern of a search engine and designs a routing model for messages between network nodes based on the data flow conditions of the search engine service. The expected result is a network service specialized in search engine traffic that allocates network resources efficiently according to the demand it supports in real time. The evaluation of the traffic pattern allowed us to identify conditions of network imbalance and message congestion.

Therefore, the designed model combines different routing models from the literature with a new criterion based on the specific traffic conditions of the search engine. For the design of this proposal, it was necessary to build a scale model of a search engine using simulation techniques, and traffic from a real system was used, which allowed us to accurately evaluate the proposed model and compare it with the routing models currently available in the literature and in existing technology. The results show that the proposed model improves the performance of the search engine network in terms of latency and network throughput.

PhD Thesis supervised by members of the group: