Data Engineer I — CloudEQ Software India Pvt. Ltd.
Jul 2022 – Dec 2024 | India
- Designed a machine-learning resource clustering algorithm in Python (scikit-learn, TensorFlow), using DBSCAN to detect utilization patterns across 500+ cloud resources; the cost-saving opportunities it surfaced led to a 25% decrease in client cloud spend
- Built automated ETL workflows in Python and SQL to collect, clean, and transform large volumes of infrastructure telemetry from AWS, Azure, and GCP API endpoints, cutting data preparation time for statistical analysis by up to 40%
- Developed an anomaly detection system in Python that applies threshold-based statistical models to live metrics from 500+ monitored infrastructure components, detecting performance degradation trends and reducing incident response time by 30%
- Applied feature engineering and clustering algorithms in Python to multi-source cloud utilization data, grouping similar infrastructure and producing structured datasets for downstream analysis
- Implemented scalable pipelines for ingesting structured data using Data Factory in Microsoft Fabric