Gagan Saini

Data Scientist & ML Engineer

I build fraud detection systems and ML-powered data pipelines across AWS, Azure, and GCP.

United Kingdom — Open to relocation

About Me

MSc Data Science candidate at the University of Aberdeen with 2+ years of professional experience building ML-powered cloud optimisation pipelines at CloudEQ Software. I specialise in fraud detection, ETL architecture, and applied machine learning across AWS, Azure, and GCP. I made a self-driven career pivot from Civil Engineering into data science, and I'm focused on shipping ML systems that solve real business problems.

Experience

Data Engineer / DevOps Engineer — CloudEQ Software India Pvt. Ltd.

Jul 2022 – Dec 2024 | India

  • Engineered ML-powered cloud cost optimisation pipelines, reducing infrastructure spend by 35% across multi-cloud environments (AWS, Azure, GCP)
  • Designed and deployed automated ETL workflows processing 50K+ daily records with 99.7% data integrity
  • Built CI/CD pipelines using Jenkins, Docker, and Terraform, cutting deployment time by 40%
  • Developed real-time monitoring dashboards integrating CloudWatch, Azure Monitor, and GCP Operations Suite, improving incident response time by 30%
  • Implemented infrastructure-as-code practices across 15+ client environments, reducing provisioning errors by 45%

Projects

Synthetic Fraud Detection Dataset Generator

A Python-based, configuration-driven system that generates synthetic datasets of 100K+ rows for fraud detection on music streaming platforms. It features a VocabularyFactory class that builds vocabulary procedurally, a DatasetConfig dataclass that exposes every generation parameter, deterministic output with SEED=42, and a 20-column schema covering both legitimate and fraudulent behaviour. It models four fraud scenarios: bot-driven stream manipulation, artist impersonation, content duplication, and metadata stuffing. ML teams can use it to train and benchmark fraud classifiers without accessing sensitive production data.

Tech: Python, NumPy, Pandas
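
A minimal sketch of the configuration-driven, seeded design described above. It is not the project's actual code: the field names, the syllable list, the 10% fraud rate, and the stream-count ranges are all illustrative assumptions, and it uses only the Python standard library rather than NumPy and Pandas so that it runs without dependencies. It also emits five columns rather than the full 20-column schema.

```python
import random
from dataclasses import dataclass
from typing import Dict, List

# The four fraud scenarios named in the project description.
FRAUD_TYPES = (
    "bot_stream_manipulation",
    "artist_impersonation",
    "content_duplication",
    "metadata_stuffing",
)


@dataclass(frozen=True)
class DatasetConfig:
    """Generation parameters (field names are hypothetical)."""
    n_rows: int = 1000
    fraud_rate: float = 0.1  # assumed share of fraudulent rows
    seed: int = 42           # fixed seed for deterministic output


class VocabularyFactory:
    """Builds artist names procedurally from syllables (illustrative only)."""
    SYLLABLES = ("ka", "lo", "mi", "ra", "su", "ten", "vo", "zy")

    def __init__(self, rng: random.Random) -> None:
        self._rng = rng

    def name(self, n_syllables: int = 3) -> str:
        parts = (self._rng.choice(self.SYLLABLES) for _ in range(n_syllables))
        return "".join(parts).capitalize()


def generate(config: DatasetConfig) -> List[Dict[str, object]]:
    """Return synthetic rows; the same config always yields the same rows."""
    rng = random.Random(config.seed)  # one seeded RNG drives every choice
    vocab = VocabularyFactory(rng)
    rows = []
    for track_id in range(config.n_rows):
        is_fraud = rng.random() < config.fraud_rate
        fraud_type = rng.choice(FRAUD_TYPES) if is_fraud else "legitimate"
        # Assumed signal: bot-driven rows get inflated daily stream counts.
        if fraud_type == "bot_stream_manipulation":
            daily_streams = rng.randint(5_000, 50_000)
        else:
            daily_streams = rng.randint(1, 500)
        rows.append({
            "track_id": track_id,
            "artist_name": vocab.name(),
            "daily_streams": daily_streams,
            "is_fraud": is_fraud,
            "fraud_type": fraud_type,
        })
    return rows


if __name__ == "__main__":
    sample = generate(DatasetConfig(n_rows=5))
    for row in sample:
        print(row)
```

Routing every random choice through a single seeded generator is what makes the output reproducible: calling `generate` twice with the same `DatasetConfig` returns identical rows, while changing `seed` produces a different dataset.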

Skills & Technologies

Languages: Python, SQL, Bash

ML / Data Science: Scikit-learn, XGBoost, Pandas, NumPy, TensorFlow, NLP

Cloud: AWS (EC2, S3, Lambda, SageMaker, CloudWatch), Azure (ML Studio, Data Factory, Monitor), GCP (BigQuery, Dataflow, Operations Suite)

Data Engineering: ETL/ELT Pipelines, Apache Airflow, Data Warehousing, Data Modelling

DevOps / MLOps: Docker, Kubernetes, Terraform, Jenkins, CI/CD, GitHub Actions

Databases: PostgreSQL, MySQL, MongoDB, Redis

Tools: Git, Jupyter, VS Code, Linux

Education

MSc Data Science — University of Aberdeen, UK (Expected June 2026)

B.Tech Civil Engineering — Ajay Kumar Garg Engineering College, Dr. A.P.J. Abdul Kalam Technical University, India (2017–2021)

Contact

Email: sainigagan163@gmail.com

Phone: +44 7823916494

GitHub: github.com/sainigagan163

LinkedIn: linkedin.com/in/gagansaini29

Location: United Kingdom