Senior Data Engineer | AWS Certified

Transforming Data Into Revolutionary Solutions

Experienced Data Engineer specializing in Big Data, Real-time Streaming, and Cloud Architecture. Building scalable data solutions that drive business value.

100 Million+ Records Processed
169 TB Data Optimized
7+ Years Experience
Ankit Sharma - Senior Data Engineer

Senior Data Engineer

Big Data Engineer with 7+ years of experience architecting and implementing large-scale data solutions. Expert in real-time streaming pipelines, ETL frameworks, and cloud-native architectures processing 100M+ records daily. AWS Certified with proven expertise in Spark, Kafka, Airflow, and Snowflake. Passionate about building scalable, high-performance data systems that drive business value.

AI & Machine Learning

Advanced algorithms and neural networks

Big Data Engineering

Scalable data pipelines and analytics

Full-Stack Development

Modern web and cloud applications

Professional Experience

Building world-class solutions at leading global organizations

Apr 2024 - Present

Senior Data Engineer

Personify Health

Leading data engineering initiatives processing 100M+ healthcare records with ETL pipelines using Spark, Airflow, AWS S3, and Snowflake.

  • Developed "Project Eraser" in Snowflake - removed 169TB of unnecessary data, optimizing storage and costs
  • Built real-time Kafka streams for data ingestion and transformation, with Apache Airflow handling orchestration
  • Orchestrated complex ETL workflows with Airflow DAGs, improving data pipeline reliability
  • Implemented data validation mechanisms ensuring 100% pipeline reliability and data quality
Apache Spark, Apache Airflow, AWS S3, Snowflake, Apache Kafka, Python

May 2023 - Apr 2024

Senior Software Engineer

Airtel Africa

Architected real-time streaming pipelines with Apache Kafka and Spark Streaming for Africa's largest telecom network.

  • Built real-time streaming pipeline using Apache Kafka and Spark ensuring efficient data processing
  • Implemented manual checkpoints for fault tolerance and precise tracking of processing progress
  • Designed modular data pipelines using Spark, Hive, HDFS, Trino, and SQL Server on Cloudera
  • Led cleanup of large-scale HDFS datasets - removed 5TB of outdated Hive metadata, improving performance
Apache Kafka, Spark Streaming, Apache Hive, HDFS, Trino, Cloudera

Dec 2022 - May 2023

Data Engineer

The Math Company

Led ETL data migration projects and implemented lakehouse architectures for Fortune 500 clients.

  • Spearheaded an ETL migration project for a client that previously used QlikSense as the ETL tool for 50 applications
  • Implemented a lakehouse architecture on Delta tables, using AWS Data Lake with the processing engine
  • Stored data in Delta Lake and optimized queries using Databricks and the Spark Catalog
  • Developed four microservices using dashboards in Databricks for efficient data access
AWS Data Lake, Databricks, Delta Lake, Apache Spark, QlikSense, Python

May 2021 - Apr 2022

Data Engineer

Accenture Research

Built REST APIs and streaming solutions for real-time data processing and analytics platforms.

  • Developed secure Spark REST API using Python for structured and unstructured data processing
  • Implemented Kafka-based streaming mechanisms ensuring 100% pipeline reliability
  • Built data extraction and transformation pipelines, with accompanying documentation, to support data-driven decisions
  • Processed data across 5+ sources with Spark SQL streaming for real-time analytics
Apache Spark, Python, REST API, Apache Kafka, Spark SQL Streaming

Revolutionizing Technology

Building world-class solutions that transform industries and empower millions

01

AI Education Platform

Building a comprehensive learning platform to democratize AI and machine learning education for everyone.

02

Industry Solutions

Developing enterprise-grade AI solutions that transform industries and drive innovation at scale.

03

Community Building

Creating a global community of AI practitioners, researchers, and enthusiasts to collaborate and innovate.

04

Research & Innovation

Conducting cutting-edge research in AI, ML, and deep learning to push the boundaries of what's possible.

Groundbreaking Solutions

World-class products powered by cutting-edge AI, ML, and data engineering

AI Studio

No-code platform for building, training, and deploying machine learning models.

ML, AutoML, Cloud

Neural Insights

Deep learning framework for building sophisticated neural network architectures.

Deep Learning, PyTorch, GPU

Predictive Analytics

Business intelligence platform powered by advanced predictive modeling.

BI, Forecasting, Insights

NLP Engine

Natural language processing toolkit for text analysis and understanding.

NLP, Transformers, API

Vision AI

Computer vision platform for image recognition and object detection.

CV, Detection, Recognition

Education & Certifications

Academic foundation and professional certifications in data engineering and cloud technologies

Bachelor of Technology

Computer Science - 70%

Jaipur National University

Aug 2018 - Jun 2020

Specialized in Computer Science with a focus on algorithms, data structures, and software engineering principles.

AWS Certified

Machine Learning - Specialty

MLS-C01

Dec 2021 - Dec 2024

Specialty-level certification demonstrating ability to design, implement, deploy, and maintain machine learning solutions on AWS.

AWS Certified

Developer - Associate

DVA-C01

Valid 2021

Validates expertise in developing, deploying, and debugging cloud-based applications using AWS services.

Expert Technology Stack

Programming Languages

Python, SQL, C++, Java, Object-Oriented Programming, Functional Programming

Databases & Storage

MySQL, PostgreSQL, MongoDB, AWS RDS, Azure SQL, Cassandra, HDFS, Hive, AWS S3, Data Lake

Big Data & Streaming

Apache Spark, Apache Kafka, Apache Airflow, Apache Hive, Trino, Snowflake, Databricks

Machine Learning & AI

Machine Learning, Deep Learning, NLP, Computer Vision, ORM, DBMS, System Design

Tools & Frameworks

Data Structures & Algorithms, Data Security, Computer Networking, ETL, Data Modeling, Problem Solving, Data Warehouse, Kafka, Hadoop

Latest Articles & Insights

Deep dives into AI, Big Data, Machine Learning, and emerging technologies

AI Future

The Future of AI: Beyond Deep Learning

Exploring the next generation of AI technologies and their potential to revolutionize industries...

Big Data

Scaling Data Pipelines with Apache Spark

Best practices for building robust, scalable data pipelines that handle billions of records...

ML Models

Advanced ML Model Optimization Techniques

Deep dive into hyperparameter tuning, model compression, and deployment strategies...

Neural Networks

Building Transformers from Scratch

Understanding the architecture that powers modern NLP models like GPT and BERT...

Let's Create Magic Together

Looking for world-class data engineering expertise? Need AI/ML consultation? Want to build revolutionary solutions? I'm here to transform your vision into reality with cutting-edge technology.

Email

ankitsharma77753@gmail.com

Phone

+91 XXX XXX XXXX

Location

India

Available for consultation and speaking engagements