Introduction

Python is my main language for almost all of my workflows. I also use R for exploratory data analysis (EDA) and to create visualizations and plots for presentations.

Main skillset

I have extensive experience in the design, testing and deployment of predictive models. I gained it by participating in Kaggle competitions and studying Google Brain training material, among other sources. What I learned allowed me to carry out projects and jobs for companies and individuals, giving me 3 years of accumulated experience in the field.

I use Jupyter Notebooks to design and test the base model, as well as to prototype the data pipelines. I usually work on two platforms for this: Vertex AI from Google Cloud and Cloud9 from AWS.

Big Data Tools and Environments

Most of the time, the data I work with amounts to several thousand gigabytes (terabytes), especially when training a model. This data needs to be processed before it can be used as a training set. For Big Data I use the following technologies and packages.

You can check out my work here

My work

For my previous work I have used Keras, a high-level API, to design the model architecture as well as the preprocessing layers and the feature creation. I also use XGBoost-based models all the time, from classifiers to regressors. However, my main forte is Deep Learning.

Here are some models I designed for my projects and past jobs.

Sales Forecasting

My task was to design a model that predicts sales by product for a retail chain. The dataset consists of the sales history by product, region, category and other attributes. Because this is time series (historical) data, I designed a pipeline to normalize and deseasonalize it.

Seasonal decomposition of the weekly sales.
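As a rough illustration of the normalizing and deseasonalizing step, the sketch below applies a simple additive decomposition to a synthetic weekly series using only NumPy. The function name `deseasonalize`, the 52-week period and the synthetic data are assumptions made for this example, and trend removal is left out for brevity; it is not the pipeline used in the project.

```python
import numpy as np

def deseasonalize(series, period=52):
    """Additive sketch: subtract a per-position seasonal mean, then z-score."""
    series = np.asarray(series, dtype=float)
    # Position of each observation within the seasonal cycle (0 .. period-1).
    phase = np.arange(series.size) % period
    # Remove the overall level so the seasonal component is centred on zero.
    centred = series - series.mean()
    # Seasonal component: average of the centred values at each cycle position.
    seasonal_means = np.array([centred[phase == p].mean() for p in range(period)])
    seasonal = seasonal_means[phase]
    adjusted = series - seasonal
    # Normalize the seasonally adjusted series to zero mean and unit variance.
    normalized = (adjusted - adjusted.mean()) / adjusted.std()
    return seasonal, normalized

# Synthetic weekly sales (assumed data): a level, a yearly cycle and noise, 3 years long.
rng = np.random.default_rng(0)
weeks = np.arange(156)
sales = 200 + 30 * np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 5, weeks.size)
seasonal, normalized = deseasonalize(sales, period=52)
```

The series must cover at least one full period so that every cycle position has an observation to average.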

The final predictions are shown below.


Movie Recommendation System

Source Code

In this project I designed a Deep Learning approach to movie recommendation that works on the sparse user-movie matrix. The goal is to show each user a ranking of the movies with the highest predicted ratings (predictions made by the model).

The architecture of the final model is shown below:

This is one of the most interesting models I have built. To reach my performance goal, I had to deal with an immense sparse matrix: the average user watches no more than 50 movies, yet there are millions of movies, some with billions of views and others with mere thousands. With feature engineering and data wrangling (including densifying the inputs), my model achieved great results. It is currently in the pipeline to be deployed in a web-based app.
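The ranking step described in this section can be sketched as follows. This toy version assumes the model's predicted ratings are already available as a small dense array; the function name `top_k_movies` and the example numbers are hypothetical, and a catalogue with millions of movies would need a sparse or batched variant.

```python
import numpy as np

def top_k_movies(predicted, watched, k=3):
    """Return, per user, the indices of the k unseen movies with the highest predicted rating."""
    scores = np.array(predicted, dtype=float)  # copy so the input is not modified
    # Exclude movies the user has already rated so they are never recommended.
    scores[np.asarray(watched, dtype=bool)] = -np.inf
    # Sort each row by descending score and keep the first k columns.
    return np.argsort(-scores, axis=1)[:, :k]

# Assumed predictions for 2 users and 5 movies (rows are users, columns are movies).
predicted = np.array([
    [4.1, 3.0, 4.8, 2.2, 3.9],
    [2.5, 4.7, 3.1, 4.9, 1.0],
])
# True where the user has already watched the movie.
watched = np.array([
    [False, False, True, False, False],
    [False, True, False, False, False],
])
ranking = top_k_movies(predicted, watched, k=2)
```

Masking watched entries before sorting keeps the recommendation list limited to movies the user has not seen yet.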


Weather Forecasting for Peruvian Amazon

Source Code

Together with other students from my university, I designed an end-to-end machine learning solution to provide forecasting services for meteorological phenomena in Amazonian towns. The following model predicts the temperature. Besides this one, we designed models to predict radiation, precipitation and the wind speed vector.

Normalized feature columns

Model testing

Instead of using an ARIMA model, I proposed a supervised approach and designed a neural network architecture. This meant doing more feature engineering (which I love), such as encoding the datetime variable with a periodic function and finding the best number of lags.
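A minimal sketch of these two feature engineering ideas, using only the standard library. The helper names `encode_cyclical` and `make_lagged_rows`, the choice of hour-of-day and day-of-year cycles, and the sample temperatures are assumptions for illustration, not the project's actual code.

```python
import math
from datetime import datetime

def encode_cyclical(ts):
    """Map a timestamp to sin/cos pairs for hour of day and day of year."""
    hour_angle = 2 * math.pi * (ts.hour + ts.minute / 60) / 24
    day_angle = 2 * math.pi * (ts.timetuple().tm_yday - 1) / 365.25
    return [math.sin(hour_angle), math.cos(hour_angle),
            math.sin(day_angle), math.cos(day_angle)]

def make_lagged_rows(values, n_lags):
    """Build (features, target) pairs where the features are the previous n_lags values."""
    rows = []
    for t in range(n_lags, len(values)):
        rows.append((values[t - n_lags:t], values[t]))
    return rows

# 06:00 on 1 January sits a quarter of the way around the daily cycle.
features = encode_cyclical(datetime(2022, 1, 1, 6, 0))
# Assumed temperature readings turned into a supervised dataset with 2 lags.
rows = make_lagged_rows([21.0, 22.5, 24.0, 23.1, 22.0], n_lags=2)
```

The sin/cos pair keeps 23:00 and 00:00 close together in feature space, which a raw hour number would not. The number of lags is a hyperparameter that can be tuned by comparing validation error across several values of `n_lags`.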


Stroke Blood Clot Image Classification

This model classifies the origin of blood clots in ischemic stroke into two acute ischemic stroke subtypes: cardioembolic and large artery atherosclerosis.

The dataset consists of hundreds of full-slide pictures of the blood clots, some of which are files on the order of gigabytes. The preprocessing layer therefore has to be efficient and able to find the optimal region where blood is present.

The pipeline scans the full image (from the left) and finds a region that contains a considerable amount of blood.
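The scan can be pictured with the toy sketch below. It assumes a small grayscale array in which dark pixels stand for blood and bright pixels for background; the function name `find_tissue_tile`, the tile size and both thresholds are hypothetical. Real slides are colour images of several gigabytes and would have to be read tile by tile rather than loaded whole.

```python
import numpy as np

def find_tissue_tile(image, tile=4, min_fraction=0.5, background=200):
    """Scan tiles column by column from the left; return the first with enough dark pixels."""
    height, width = image.shape
    for x in range(0, width - tile + 1, tile):        # left to right
        for y in range(0, height - tile + 1, tile):   # top to bottom within a column
            patch = image[y:y + tile, x:x + tile]
            # Pixels darker than the background threshold are treated as blood.
            fraction = np.mean(patch < background)
            if fraction >= min_fraction:
                return x, y, float(fraction)
    return None  # no tile contains enough blood

# Toy grayscale slide (assumed data): white background with one dark 4x4 block.
slide = np.full((8, 16), 255, dtype=np.uint8)
slide[4:8, 8:12] = 60
result = find_tissue_tile(slide)
```

Returning as soon as a tile passes the threshold avoids processing the rest of the slide, which matters when a single file is gigabytes in size.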

Architecture of the model: Convolutional Neural Network with dense output.

My contact info is over here

Contact

I am currently open to both full-time and part-time job offers, and I am available immediately. However, I would prefer a full-time, fully remote position.

I would be more than happy to provide more information about my work on request.

Scheduling a virtual interview to discuss my past and current projects, experience and skills would also be amazing!

Social Media

These are my Kaggle and GitHub accounts:


If you prefer, you can text or WhatsApp me to schedule a call. My phone number is the following.

(+51) 989-312-330