About Me
Hey there, I'm Prakhar Srivastava, and I've been working in Data Engineering for 4 years now. I pursued my Master's in Computer Science with a focus on Artificial Intelligence at the University of Tartu in Estonia.
I'm passionate about building efficient data pipelines and infrastructure solutions that make a real impact. My expertise lies in designing systems that handle large volumes of data smoothly, ensuring everything runs seamlessly behind the scenes. I love diving into the latest tech trends to keep pushing the boundaries of what's possible in the world of data engineering.
Employment History
- Data Engineer at EyeVi Technologies, Estonia
EyeVi Technologies is an Estonian startup founded in 2020 with the mission to improve road networks through innovative AI-driven mapping technologies.
Develop and optimize ETL pipelines for processing and analyzing geospatial data including road panoramas, traffic signs, and road conditions, ensuring data accuracy and consistency in collaboration with data scientists. Implement performance improvements and maintain scalable data workflows to support large-scale data integration projects involving 50TB of data.
Standardize road data from data warehouses, addressing inconsistent kilometer markers, varied traffic sign formats, and high-resolution panoramas. Create automated validation tools to correct misaligned markers, blurred signs, and inaccurate condition reports, ensuring reliable datasets and saving an estimated 40 hours per month in manual validation and corrections.
Reduced ETL processing time by 30% and increased data accuracy by 25%. Deliver high-quality road data to major US and EU clients, including Google, Prointec, Idea, and Xais, bolstering the company's market reputation and client satisfaction.
Tools: GCP (BigQuery, Looker), Snowflake, dbt, Python, PostgreSQL, Airflow, Terraform, Docker, Kubernetes, Linux, Git.
- Machine Learning Intern at KappaZeta, Estonia
KappaZeta, an Estonian startup established in 2015, leverages satellite technology with AI to help farmers boost agricultural efficiency and sustainability.
Engineered and deployed advanced machine learning algorithms for cloud removal and crop delineation in satellite images, encompassing data preprocessing, model training, and integration into a satellite image monitoring platform. Processed over 1.3 million square kilometers of satellite imagery, covering Estonia, Poland, Sweden, and Spain.
Created an AI-based cloud mask processor to mitigate cloud interference in Sentinel-2 satellite imagery, enhancing image clarity and accuracy. Engineered solutions for identifying and mapping crop boundaries in noisy and varying-resolution TIFF images, improving crop monitoring precision for insurance companies and achieving a feedback score of 4.8 out of 5.
Increased the accuracy of cloud detection by 35% and of crop delineation models by 40%, resulting in higher-quality satellite data and more reliable crop monitoring. Recognized by supervisors for innovative and effective implementation.
Tools: AWS (S3, SageMaker), PyTorch, Python, GAN, Transformer, CNN, Docker, Git.
- Research Assistant at University of Tartu, Estonia
Collaborated with a PhD student to optimize a real-time pedestrian detection model using transformer architectures. Tasks included data preprocessing (augmenting and normalizing large datasets), model training (experimenting with different transformer configurations), and using a high-performance computing cluster with 4 GPUs.
Addressed the challenge of detecting pedestrians in real time under diverse environmental conditions, optimizing the model to handle variations in lighting and occlusions. This involved integrating attention mechanisms and fine-tuning hyperparameters to improve detection accuracy and reliability, achieving an inference speed of 30 FPS.
Enhanced the pedestrian detection model's accuracy by 12% and reduced inference time by 25%, contributing to a significant improvement in real-time processing capabilities.
Tools: Google Colab, PyTorch, Python, NumPy, Pandas.
- Data Engineer (Associate Consultant) at Atos | Syntel, India
Project type: Telecom
Developed and fine-tuned ETL pipelines to process and analyze large volumes of telecom data for the Dubai-based client DU, including tariff plans, call detail records (CDRs), billing information, and subscriber data. This ensured efficient data processing, which supported decision-making and operational efficiency, resulting in an annual revenue increase of $2M.
Resolved data inconsistencies and latency by redesigning ETL workflows, implementing robust validation checks, and optimizing SQL queries. The result was faster, more reliable data processing and more accurate, timely analytics and reporting, saving an average of 15 hours per week in data processing time.
Reduced data processing time by 40% and improved data accuracy and customer segmentation by 30%, leading to a 15% increase in targeted marketing campaign effectiveness.
Tools: AWS, Spark, Snowflake, Python, SQL, Airflow, Docker, Elasticsearch, Tableau, Kubernetes, Linux.
- Data Analyst (Associate Consultant) at Atos | Syntel, India
Project type: Healthcare
Analyzed large datasets, including patient records, prescription data, and delivery logistics, for the US-based client IHA Independent Health to optimize e-medicine delivery processes. Maintained data pipelines and collaborated with cross-functional teams to create KPIs using Power BI, resulting in a 15% increase in sales and improved decision-making.
Solved delivery delay issues by analyzing data on delivery times and traffic patterns, then implementing a predictive model for delivery times and route optimization, which reduced delivery times by an average of 15%.
Achieved a 20% increase in customer satisfaction ratings and a 15% reduction in operational costs through the integration of a scheduling system and advanced delivery tracking.
Tools: Python, SQL, Microsoft Excel, SAS Enterprise Miner, Airflow, Power BI, Oracle Database, Linux.