About me

Since January 2022, I have been Head of Data at Getlife, the insurtech with the highest growth in 2021. Previously, I was Technical Manager of Data Analytics & AI at Gradiant. Before that, I worked mainly as a data analyst focused on event prediction in Big Data environments, both in the ICT sector (Optare Solutions) and in energy management (Ecomanagement Technology). During this period, I combined my work with research and teaching as a member of the Department of Statistics and O.R. at the University of Vigo, where I also obtained my PhD in Statistics and O.R.

I have wide experience solving real-life problems by applying and developing new techniques in the fields of Machine Learning and Artificial Intelligence. In particular, my research focuses on novel contributions to predictive modeling, computational statistics, nonparametric curve estimation, survival analysis and software development, all of them with applications to cybersecurity, among other areas. My contributions have been published in several articles in high-impact journals, resulting from my participation in research projects and international conferences as an active member of the SiDOR group (Statistical Inference, Decision and Operations Research).

Research

Journal articles

PhD thesis

Technical documents


Software


  • clustcurv package. Author and maintainer of this R package (https://cran.r-project.org/web/packages/clustcurv/) for determining groups in multiple curves, with automatic selection of the number of groups based on the k-means or k-medians algorithms. The optimal number of groups is selected by bootstrap methods. The methodology can be applied in both regression and survival frameworks. The implemented methods include the grouping of multiple survival curves described by Villanueva et al. (2019).

  • npregfast package. Author and maintainer of this R package (http://cran.r-project.org/web/packages/npregfast/) for nonparametric estimation in the analysis of factor-by-curve interactions. npregfast allows the user to obtain nonparametric estimates using local linear kernel smoothers and to compare them across the levels of a factor. The package can also draw inference about critical points, such as maxima or change points, linked to the derivative curves. The inference (confidence intervals and tests) is based on the bootstrap. Besides smooth estimates, the package also provides estimates based on classical parametric models such as the allometric model, one of the most widely used models in biology for studying the relationship between two biometric variables. Additionally, binning-type acceleration techniques have been implemented.

  • FWDselect package. Author and maintainer of this R package (http://cran.r-project.org/web/packages/FWDselect/), which introduces a simple method for selecting the best model or the best subset of variables. It supports different types of responses (Gaussian, binary or Poisson) and can be applied in different contexts (parametric or nonparametric).

  • seq2R package. Author of this R package (http://cran.r-project.org/web/packages/seq2R) for detecting compositional changes in genomic sequences. The software can load .fasta or .gbk files and retrieve sequences from the GenBank database. The package detects differences or asymmetries in nucleotide composition by using local linear kernel smoothers. It is also possible to draw inference about critical points (i.e. maximum or minimum points) related to the derivative curves. Additionally, bootstrap methods are used to estimate confidence intervals, and binning techniques have been implemented in seq2R to speed up computation.