Currently pursuing a PhD in Computing at Federal Fluminense University (UFF). Senior Data Science professional with expertise in Machine Learning, Data Analytics, and AI/LLM applications. I combine strong technical foundations with practical experience in both government and academic environments, working on projects involving data-driven decision-making, NLP, and explainable AI. My research centers on optimizing LLM fine-tuning for energy efficiency and building interpretable ML solutions.
📌 Current Focus:
- 🔬 Researching LLM fine-tuning with a focus on Energy Efficiency and Green AI
- 🔭 Building AI/Analytics solutions at TCERJ for Auditing & Control (Government sector)
- 👨‍🏫 Teaching Data Analytics at ESPM and Data Science in UFF's graduate program
- 🌱 Mastering Cloud Computing (Azure & GCP)
- 💻 Interested in Self-Analytics, NLP, and Explainable AI
- Hyperparameter Optimization for Language Models: Reproducible implementation of BERT-PLI for exhaustive hyperparameter grid search, with parallel GPU execution, resource tracking, and automatic analysis of results. Runs hundreds of hyperparameter combinations with energy monitoring. Tech Stack: Python, PyTorch, Transformers, CodeCarbon, Weights & Biases
- Framework for Evolving Decision Trees: Python library for evolving decision trees with genetic algorithms, enabling automatic optimization of interpretable models. Tech Stack: Python, Scikit-Learn, NumPy, Genetic Algorithms
- Legal Data Preprocessing Pipeline: Modular preprocessing pipeline for legal texts in C++, with parallel orchestration via a dependency graph. Implements sequential and parallel execution with data partitioning, achieving a speedup of up to 5.24x. Tech Stack: C++17, CMake, Makefile
- LLM Fine-tuning System with Energy Monitoring: Complete framework for fine-tuning language models (LLaMA 3.2 3B) with LoRA, including advanced data preprocessing, synchronized energy consumption monitoring, and CO₂ emissions tracking. Tech Stack: Python, LangChain, HuggingFace, PyTorch, Transformers, CodeCarbon

| Area | Description | Technologies |
|---|---|---|
| Machine Learning | Model development, feature engineering, hyperparameter optimization | Scikit-learn, XGBoost, LightGBM, DecisionTree |
| Deep Learning | Neural networks, transfer learning, fine-tuning | TensorFlow, PyTorch, Transformers |
| NLP & LLMs | Text processing, tokenization, LLM fine-tuning, prompt engineering | HuggingFace, LangChain, Gemini, LLaMA |
| Data Analysis | Exploratory analysis, statistical inference, dashboarding | Pandas, NumPy, Plotly, Dash |
| Explainable AI | Model interpretability, SHAP, feature importance | LIME, SHAP, TreeExplainer |
| Green AI | Energy-efficient training, carbon footprint tracking, sustainable ML | CodeCarbon, Energy monitoring |
| Data Engineering | ETL pipelines, data cleaning, preprocessing | Python, SQL, Apache tools |
📖 Medium: @gassanttos
Check out my articles on Machine Learning, Data Science, and AI topics!
- 💚 Sustainable AI and Green Computing practices
- 🔬 Advanced LLM architectures and techniques
- 🧬 Graph Neural Networks and knowledge graphs
- ☁️ Cloud Computing (Azure & GCP)
Open to collaborations on ML/NLP projects, open-source contributions, and knowledge sharing!


