Data Engineer | MS Data Science @ GWU (GPA 3.9) | Washington, DC
I design and build robust data pipelines and warehousing solutions that transform raw data into actionable insights. With 2+ years of hands-on experience in ETL, data modeling, and real-time processing, I've optimized workflows to reduce latency by 30% and ensured data accuracy across enterprise systems.
I'm a data engineer passionate about building efficient, scalable data infrastructure. Starting my career at Tata Consultancy Services (TCS), I developed expertise in SQL, ETL pipeline design, and data warehousing using Star Schema. Now, I'm leveraging Python, PySpark, and advanced analytics to solve complex data challenges at scale.
Data Engineering: ETL Pipeline Design · Data Warehousing (Star Schema) · Data Modeling · Real-time & Batch Processing · Automated Data Validation
Languages & Databases: SQL (Advanced) · Python (Pandas, PySpark, OpenCV) · Java · SQL Server · MongoDB
Tools & Platforms: Informatica Cloud · Git · Linux · Docker · Jira · Power BI
Certifications & Awards: Star of The Month (TCS) · OSPO Award (Real-Time Danger Detection System)
OSPO Award Winner
- Engineered a real-time computer vision pipeline using Python, OpenCV, and YOLOv3 to detect hazardous conditions
- Built an automated notification system that sends instant alerts with visual evidence
- Recognition: OSPO Award and cash prize for technical innovation in public safety
- Designed a Star Schema data warehouse with optimized ETL procedures
- Created query-ready datasets for business intelligence and analytics
- Optimized data access and retrieval workflows for efficient query performance
- Analyzed 12+ million records using PySpark and Parquet to identify high-density traffic zones
- Implemented complex spatial transformation logic within ETL workflows
- Applied large-scale data processing and feature engineering techniques to the dataset
GWU Data Science Capstone · Aug 2025 – Present
- Designed a reinforcement learning (RL) pipeline for pseudo-labeling that improved downstream model accuracy over standard self-training baselines
- Automated the generation of high-confidence labels for unlabeled data, reducing reliance on manual annotation
- Benchmarked RL against non-RL approaches to quantify trade-offs in accuracy and efficiency, delivering a reproducible workflow
GWU, Washington, DC · Jan 2025 – May 2025
- Applied machine learning and transformer-based NLP models to classify math problems into eight categories
- Developed data preprocessing and evaluation pipelines for model comparison and reproducibility
- Documented methodologies and findings for academic dissemination
GWU
- Built Python-based thermal models to analyze sensible–latent heat storage systems and assess discharge efficiency
- Applied machine learning optimization to predict system performance under varying conditions
- Achievement: Paper accepted for poster presentation at the 7th Battery and Energy Storage Conference, 2025
GenAI / LLMs
- Developed an end-to-end generative AI pipeline using OpenAI models and LangChain to automatically generate transcripts, summaries, and viral clips from long-form video content
- Optimized system prompts to improve summarization accuracy and context retention by 20%
- Fabrication and Study of PCM-based Waste Heat Recovery System – POCER 2019, Nottingham University
- Thermal Energy Storage System Performance Using PCM-Variable Heat – POCER 2019, Nottingham University
- Paper accepted for poster presentation at the 7th Battery and Energy Storage Conference, 2025
- 🏆 OSPO Award (with cash prize) – Real-Time Danger Detection System
- ⭐ Star of The Month – Tata Consultancy Services (TCS)
- 🎓 Global Leaders Award – George Washington University (Scholarship)
Master of Science, Data Science
George Washington University, Washington, DC · December 2025
GPA: 3.9 | Honors: Global Leaders Award (Scholarship)
Operations Lead | Google Developer Groups
August 2024 – Present
- Directed the "Build with AI" conference, featuring C-suite executives (CEOs and CTOs)
- Moderated an executive panel on emerging tech trends and enterprise scalability
- Collaborated with university departments to promote ethical, inclusive technology learning
Volunteer | IEEE Student Chapter
January 2018 – May 2020 · JNTU, India
- Coordinated technical workshops and hackathons for 200+ participants
- Managed stakeholder communication and event scheduling
📧 Email: [email protected]
📱 Phone: 571-274-9816
💼 LinkedIn: https://www.linkedin.com/in/deepikardy7129
📍 Location: Washington, DC

