gassantos/README.md

👋 Hi, I'm Gustavo Alexandre

I am pursuing a PhD in Computing at Fluminense Federal University (UFF) and work as a senior data science professional with expertise in Machine Learning, Data Analytics, and AI/LLM applications. I combine strong technical foundations with practical experience in both government and academic environments, working on projects involving data-driven decision making, NLP, and explainable AI. My current focus is optimizing LLM fine-tuning for energy efficiency and building interpretable ML solutions.

📌 Current Focus:

  • 🔬 Researching LLM Fine-tuning with focus on Energy Efficiency and Green AI
  • 🔭 Building AI/Analytics solutions at TCERJ for Auditing & Control (Government sector)
  • 👨‍🏫 Teaching Data Analytics at ESPM and Data Science in UFF's graduate program
  • 🌱 Mastering Cloud Computing (Azure & GCP)
  • 💻 Interested in Self-Analytics, NLP, and Explainable AI

🚀 Featured Projects

Hyperparameter Optimization for Language Models

Reproducible implementation of BERT-PLI for exhaustive hyperparameter grid search, with parallel GPU execution, resource tracking, and automatic result analysis. Runs hundreds of hyperparameter combinations with energy monitoring.

Tech Stack: Python, PyTorch, Transformers, CodeCarbon, Weights & Biases
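
As a rough illustration of what an exhaustive grid search enumerates, here is a minimal sketch using only the Python standard library. The search space, the function names (`expand_grid`, `assign_round_robin`), and the round-robin split across GPUs are assumptions made for this example; they are not taken from the project code.

```python
from itertools import product

# Hypothetical search space; the values are illustrative, not the project's.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 3e-5, 5e-5],
    "batch_size": [8, 16],
    "epochs": [2, 3],
}


def expand_grid(space):
    """Return every combination of hyperparameter values as a list of dicts."""
    names = list(space)
    value_lists = (space[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


def assign_round_robin(configs, num_gpus):
    """Split the configurations across GPU workers in round-robin order."""
    buckets = [[] for _ in range(num_gpus)]
    for index, config in enumerate(configs):
        buckets[index % num_gpus].append(config)
    return buckets


configs = expand_grid(SEARCH_SPACE)
buckets = assign_round_robin(configs, num_gpus=2)
print(len(configs), [len(bucket) for bucket in buckets])  # 12 [6, 6]
```

The grid grows multiplicatively (3 × 2 × 2 = 12 here), which is why tracking energy per run becomes relevant once the search reaches hundreds of combinations.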

Framework for Evolving Decision Trees

Python library for evolving decision trees with genetic algorithms, enabling automatic optimization of interpretable models.

Tech Stack: Python, Scikit-Learn, NumPy, Genetic Algorithms
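
To give a feel for the idea, the sketch below evolves the split threshold of a depth-1 tree (a decision stump) with a tiny genetic algorithm. It is a deliberate simplification written for this README, not the library's API: the toy dataset, the `accuracy` fitness function, and the `evolve` parameters are all assumptions.

```python
import random

# Toy one-feature dataset (made up for illustration): label is 1 when x > 5.
DATA = [(x, int(x > 5)) for x in range(11)]


def accuracy(threshold):
    """Fitness: accuracy of the stump 'predict 1 if x > threshold'."""
    hits = sum(int(x > threshold) == label for x, label in DATA)
    return hits / len(DATA)


def evolve(generations=30, population_size=20, mutation_scale=1.0, seed=0):
    """Evolve a split threshold with selection, crossover, and mutation."""
    rng = random.Random(seed)
    population = [rng.uniform(0, 10) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        population.sort(key=accuracy, reverse=True)
        parents = population[: population_size // 2]
        # Crossover (average of two parents) followed by Gaussian mutation.
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            children.append((a + b) / 2 + rng.gauss(0, mutation_scale))
        population = parents + children
    return max(population, key=accuracy)


best = evolve()
print(round(best, 2), round(accuracy(best), 3))
```

A full tree would need a richer genome (feature indices, thresholds, and structure), but the loop of evaluate, select, recombine, and mutate stays the same.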

Legal Data Preprocessing Pipeline

Modular preprocessing pipeline for legal texts in C++, with parallel orchestration via a dependency graph. Implements sequential and parallel execution with data partitioning, achieving a speedup of up to 5.24x.

Tech Stack: C++17, CMake, Makefile
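
The project itself is written in C++; the following Python sketch only illustrates the general idea of scheduling stages from a dependency graph, running stages with no pending dependencies together. The stage names and the `run_stage` placeholder are hypothetical and do not come from the repository.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical preprocessing stages; each key depends on the stages in its set.
DEPENDENCIES = {
    "load": set(),
    "strip_markup": {"load"},
    "normalize_case": {"load"},
    "tokenize": {"strip_markup", "normalize_case"},
}


def run_stage(name):
    """Placeholder for real work; returns the stage name when done."""
    return name


def run_pipeline(dependencies, max_workers=4):
    """Run stages in dependency order, executing independent stages together."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Stages whose dependencies have all finished.
            ready = sorted(sorter.get_ready())
            # Run this wave in parallel and wait for every stage in it.
            list(pool.map(run_stage, ready))
            sorter.done(*ready)
            waves.append(ready)
    return waves


waves = run_pipeline(DEPENDENCIES)
print(waves)  # [['load'], ['normalize_case', 'strip_markup'], ['tokenize']]
```

Here `strip_markup` and `normalize_case` can run at the same time because both depend only on `load`. Speedup figures such as the one quoted above are conventionally computed as sequential time divided by parallel time.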

LLM Fine-tuning System with Energy Monitoring

Complete framework for fine-tuning language models (LLaMA 3.2 3B) with LoRA, including advanced data preprocessing, synchronized energy-consumption monitoring, and CO₂ emissions tracking.

Tech Stack: Python, LangChain, HuggingFace, PyTorch, Transformers, CodeCarbon
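
One reason LoRA suits energy-aware fine-tuning is that it trains two small low-rank factors instead of a full weight matrix. The arithmetic below is a back-of-the-envelope sketch: the layer size (3072 × 3072) and the rank (8) are assumed values chosen for illustration, not settings read from the project.

```python
def full_params(d_in, d_out):
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed example dimensions; adjust to the layer and rank actually used.
d_in, d_out, rank = 3072, 3072, 8

dense = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
print(dense, lora, dense // lora)  # 9437184 49152 192
```

Under these assumptions, the adapter for one such layer has 192 times fewer trainable parameters than the dense matrix it adapts.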


💻 Technology Stack

Python & ML Libraries:

LLM & NLP Frameworks: LangChain HuggingFace Transformers

Databases & Data Tools:

Web & Visualization: Flask Plotly Dash SQL

Cloud & DevOps: Azure GCP Docker

Systems & Low-Level:

Monitoring & Analytics: CodeCarbon Weights & Biases


🎯 Areas of Expertise

| Area | Description | Technologies |
| --- | --- | --- |
| Machine Learning | Model development, feature engineering, hyperparameter optimization | Scikit-learn, XGBoost, LightGBM, DecisionTree |
| Deep Learning | Neural networks, transfer learning, fine-tuning | TensorFlow, PyTorch, Transformers |
| NLP & LLMs | Text processing, tokenization, LLM fine-tuning, prompt engineering | HuggingFace, LangChain, Gemini, LLaMA |
| Data Analysis | Exploratory analysis, statistical inference, dashboarding | Pandas, NumPy, Plotly, Dash |
| Explainable AI | Model interpretability, SHAP, feature importance | LIME, SHAP, TreeExplainer |
| Green AI | Energy-efficient training, carbon footprint tracking, sustainable ML | CodeCarbon, Energy monitoring |
| Data Engineering | ETL pipelines, data cleaning, preprocessing | Python, SQL, Apache tools |

📚 Publications & Content

📖 Medium: @gassanttos

Check out my articles on Machine Learning, Data Science, and AI topics!


🤝 Get in Touch

LinkedIn GitHub Medium


🎓 Currently Learning

  • 💚 Sustainable AI and Green Computing practices
  • 🔬 Advanced LLM architectures and techniques
  • 🧬 Graph Neural Networks and knowledge graphs
  • ☁️ Cloud Computing (Azure & GCP)

Open to collaborations on ML/NLP projects, open-source contributions, and knowledge sharing!

Pinned

  1. gridsearch-skyband Public

    The GridSearch Skyband project optimizes language model pipelines through a reproducible PyTorch implementation. It automates exhaustive hyperparameter searches and parallel GPU execution while mon…

    Python

  2. finetuning-energy Public

    LLM fine-tuning system focused on energy efficiency.

    Python

  3. graph_priority_queue Public

    Legal-text pipeline that uses a priority queue to schedule tasks in a dependency graph.

    C++

  4. evolvedtree Public

    A machine learning model that combines two computational intelligence approaches: Genetic Algorithm and Decision Tree. The model's name (EvolveDTree) is an acronym for "Evolved Decision …

    Jupyter Notebook 6 5