Hi, I'm Anand Raj.
A Machine Learning Engineer with a robust technical toolkit and a passion for applying Natural Language Processing and Computer Vision to build intelligent solutions. Proficient in Python and C++, I work with frameworks and libraries such as PyTorch, TensorFlow, Hugging Face, LangChain, NLTK, and spaCy. I also bring expertise in cloud platforms (AWS, GCP), data visualization tools, and agile product development methodologies. Eager to contribute my skills to ML/NLP roles, I thrive on transforming complex data into actionable insights.
About
I'm a software engineer with a strong background in machine learning, deep learning, and software engineering. I have experience building machine learning solutions to solve business problems, and I am skilled in natural language processing and generative AI. Additionally, I have developed advanced driver assistance systems like Emergency Brake Assist and Rear Pre-Crash Predict.
- Programming languages: R, Python, and C++.
- Database: SQL, MongoDB, and Neo4j (Graph Database).
- Libraries and Tools: NumPy, Pandas, Matplotlib, scikit-learn, Folium, Plotly, PyTorch, Keras, TensorFlow, NLTK, spaCy, Gensim, Hugging Face, LangChain, Databricks, Tableau, Flask, AWS, GCP.
- Product Development: Agile Methodology, Product Life Cycle, Jira, Confluence, Git, GitHub.
Experience
- Designed and developed scalable AI/ML solutions for dynamic article labeling, automating customer query resolution and generating personalized marketing messages in Databricks using PySpark by leveraging AWS S3 for data storage.
- Developed a dynamic labeling system using LLMs, iteratively generating new labels for articles based on previous batches, resulting in a 90% reduction in manual labeling effort.
- Automated customer query resolution by fine-tuning a pre-trained LLM (Flan-T5) on 34,000 customer queries and resolutions with Parameter-Efficient Fine-Tuning (LoRA), then deployed the model to production on Databricks. Improved response accuracy and efficiency led to a 60% reduction in the customer care team's workload.
- Implemented a statistical modeling solution to predict customer engagement with promotional emails, using a dataset of 30 million customers. Applied advanced feature engineering and machine learning algorithms to optimize prediction accuracy, with the aim of improving targeting for marketing campaigns.
- Worked on Advanced Driver Assistance Systems and developed products like Emergency Brake Assist, Blind Spot Monitoring, and Rear Pre-Crash Predict. Major products: Volkswagen ID Buzz and Mercedes-Benz Sprinter Van.
- Engineered and validated safety-critical functions for multiple OEMs, driving robust requirements engineering and compliance with industry standards (Euro NCAP), and integrating elements of planning and controls.
- Designed and tested algorithms in C/C++ at L3 Level using GTest for ARS-5th Gen and SRR, ensuring reliability with QAC compliance and version control via Git/GitHub.
- Automated simulation scenario generation in Carmaker IPG by scripting in Python, significantly reducing manual efforts and streamlining testing processes by 85%.
- Resolved customer-reported bugs in the simulation environment.
- Enhanced pedestrian detection performance to 87% for Mercedes-Benz VS30 platform vans by fine-tuning Region of Interest (ROI) parameters, significantly improving feature representation.
- Analyzed recorded vehicle data, extracting and preprocessing frames for model training. Developed a machine learning model using HOG (Histogram of Oriented Gradients) and SVC in Python using TensorFlow, and transitioned successful models to real-time inference through a C++ implementation on vehicle ECUs.
- Collaborated with a team to conduct in-depth data analysis, using data mining techniques in Python and Tableau to provide valuable insights into the client's sales data.
- Conducted behavioral segmentation of 500,000+ users, identifying key patterns in user engagement, temporal trends, and conversion rates between free and paid users, leading to an 8% increase in customer retention.
- Formulated data-driven recommendations and compelling narratives and communicated them to the client, resulting in a 10% uplift in paid user conversions.
- Developed and deployed a high-performing multi-label classification model using BERT, Flask, and AWS EC2 to automatically categorize NLP research papers, improving categorization accuracy (micro F1 score) by 20% and enabling senior scientists to identify key research papers 30% faster.
- Assisted in reviewing literature and research papers authored by senior scientists, contributing to the integration of cutting-edge insights into their ongoing work.
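The LoRA fine-tuning mentioned in the Flan-T5 item above trains two small low-rank matrices next to each frozen weight matrix instead of updating the matrix itself. The sketch below only counts parameters to show why this is cheap. The layer size (768 x 768) and rank (8) are illustrative assumptions, since the Flan-T5 variant and LoRA rank used in the project are not stated here.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    Full fine-tuning updates the whole d_out x d_in matrix W.
    LoRA freezes W and trains B (d_out x rank) and A (rank x d_in),
    using W + B @ A (up to a scaling factor) as the effective weight.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(d_out=768, d_in=768, rank=8)
print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA (rank 8):    {lora} trainable parameters")
print(f"ratio:            {lora / full:.4f}")
```

For this assumed layer, LoRA trains roughly 2% of the parameters that full fine-tuning would.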
Projects

Generating medical reports from chest X-ray images using vision language models
- Tools: PyTorch, AWS VPM, deployed using Streamlit.
- Designing and implementing a scalable solution for generating medical reports from chest X-ray images. Employing BioViLT for image feature extraction and developing an alignment model to align image features with text data. Fine-tuning a Large Language Model (BioGPT) to produce accurate, context-aware medical reports, optimizing workflows for diagnostic efficiency and reliability. Optimizing the model and maximizing GPU utilization for efficient processing.

User-friendly software product designed to democratize machine learning.
- Tools: Python, Flask, HTML and CSS.
- Leading the development of EzFlow.ai, a user-friendly platform designed to empower users with no coding experience to learn and implement machine learning projects.
- By automating data preprocessing, model training, and result visualization, EzFlow.ai provides users with predictions and comprehensive summary reports.

Model to Generate music using LSTM neural networks.
- Tools: Python, Keras, TensorFlow.
- Implements an Artificial Music Generator using LSTM (Long Short-Term Memory) networks, a type of recurrent neural network (RNN).
- The system generates music character by character based on a given input dataset.
- The LSTM model is trained on a corpus of music data, and new compositions are then sampled from the trained model.
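The character-by-character sampling loop described above can be sketched in plain Python. This is a minimal illustration, not the project's code: the trained Keras LSTM is replaced by a hypothetical stand-in function `next_char_probs`, and the toy vocabulary is an assumption, since the actual music encoding is not stated here.

```python
import random

# Hypothetical toy vocabulary; the real corpus and encoding are not shown in the source.
VOCAB = ["A", "B", "C", "D", "E", "F", "G", "|"]


def next_char_probs(context):
    """Stand-in for the trained LSTM: one probability per vocabulary entry.

    It simply favours the symbol that follows the last one in VOCAB.
    A real model would compute this distribution from the whole context.
    """
    weights = [1.0] * len(VOCAB)
    if context:
        last = VOCAB.index(context[-1])
        weights[(last + 1) % len(VOCAB)] += 3.0
    total = sum(weights)
    return [w / total for w in weights]


def sample_index(probs, rng):
    """Draw an index from a categorical distribution (inverse-CDF sampling)."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point rounding


def generate(seed, length, rng):
    """Extend `seed` one character at a time, feeding each output back as context."""
    out = list(seed)
    for _ in range(length):
        probs = next_char_probs(out)
        out.append(VOCAB[sample_index(probs, rng)])
    return "".join(out)


print(generate("A", 32, random.Random(0)))
```

Sampling from the distribution, rather than always taking the most likely character, is what lets the generator produce varied sequences from the same seed.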

An end-to-end deep learning approach for autonomous car steering control using Convolutional Neural Networks (CNNs).
- Tools: TensorFlow
- The model is Nvidia's DAVE-2 end-to-end CNN with some modifications. It is designed to map raw pixel data from a front-facing camera directly to steering commands, enabling self-driving functionality. Unlike traditional approaches that require manual decomposition of tasks such as lane detection, semantic abstraction, path planning, and control, the DAVE-2 system learns to perform these tasks automatically from human steering angle data.
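To give a sense of how compact the network is, the sketch below walks through the feature-map sizes of the five convolutional layers as published in Nvidia's DAVE-2 paper (66 x 200 input, no padding). It reflects the published architecture only; the modifications made in this project are not described here, so they are not modeled.

```python
def conv_out(size, kernel, stride):
    """Output length of a 'valid' (no padding) convolution along one axis."""
    return (size - kernel) // stride + 1


# (filters, kernel size, stride) for the five convolutional layers in the published paper.
CONV_LAYERS = [(24, 5, 2), (36, 5, 2), (48, 5, 2), (64, 3, 1), (64, 3, 1)]

height, width = 66, 200  # input image size used in the paper
shapes = []
for filters, kernel, stride in CONV_LAYERS:
    height = conv_out(height, kernel, stride)
    width = conv_out(width, kernel, stride)
    shapes.append((height, width, filters))

# Size of the vector handed to the fully connected layers.
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

for i, shape in enumerate(shapes, start=1):
    print(f"conv{i}: {shape}")
print(f"flattened features: {flat}")
```

In the published design, the flattened vector feeds fully connected layers of 100, 50, and 10 units, followed by a single steering output.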
Aims to identify duplicate questions using natural language processing
Employs advanced NLP and ML to analyze Amazon reviews
- Tools: Python, Flask, AWS (for deployment).
- Developed a model to predict whether a user's text review of a product is positive or negative.
- Performed extensive text cleaning and featurization, achieving an AUC score of 0.90 using an SGD Classifier.
- Deployed using Flask on an AWS EC2 virtual machine.
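To make the AUC metric quoted above concrete: AUC is the probability that a randomly chosen positive review receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The labels and scores are made-up toy values, not data from this project, and the function is an illustration rather than the evaluation code that was used.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values for illustration only: 1 = positive review, 0 = negative review.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
print(f"AUC = {roc_auc(labels, scores):.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a score of 0.90 means about nine in ten positive/negative pairs are ordered correctly.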
Research Publications

Facial Feature Extraction and Emotional Analysis Using ML

Performance Comparison of Prediction Algorithms for Forecasting of Wind Power Generation
- Compares ARIMA, SARIMAX, and ARMA algorithms for wind power generation forecasting.
- ARIMA was identified as the most accurate, with the lowest MSE (523.01).
- Accurate forecasting minimizes errors and enhances power grid reliability.
- Click on the link to view the IEEE paper for detailed information.
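The comparison criterion above, mean squared error (MSE), averages the squared gaps between forecast and observed values, and the model with the lowest MSE is preferred. The sketch below shows that selection step. The series and the model names `model_a` and `model_b` are hypothetical toy values for illustration and are not taken from the paper.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Toy observations and forecasts, invented purely for illustration.
actual = [10.0, 12.0, 14.0, 13.0]
forecasts = {
    "model_a": [11.0, 12.0, 13.0, 13.0],
    "model_b": [8.0, 12.0, 16.0, 15.0],
}

# Score every model, then pick the one with the smallest error.
mse_by_model = {name: mse(actual, pred) for name, pred in forecasts.items()}
best = min(mse_by_model, key=mse_by_model.get)

for name, value in mse_by_model.items():
    print(f"{name}: MSE = {value:.2f}")
print(f"lowest MSE: {best}")
```

Because errors are squared, MSE penalizes a few large misses more heavily than many small ones, which matters when large forecast errors are costly for grid operation.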
Education
Washington DC, USA
Degree: Master of Science in Data Science
CGPA: 3.95/4.0
Relevant Coursework:
- Data Mining
- Natural Language Processing
- Cloud Computing
- Machine Learning
Bangalore, India
Degree: Bachelor of Engineering in Electrical and Electronics Engineering
GPA: 7.68/10
Relevant Coursework:
- Programming in C and Data Structures
- Engineering Mathematics
- Engineering Physics
- Object Oriented Programming Using C++
- Python Application Programming
Blogs
- 1. Logistic Regression’s Journey with Imbalanced Data (TowardsAI)
- 2. Unleashing the Power of Skrub: Revolutionizing Table Preparation for Machine Learning (Stackademic)
- 3. CarMaker by IPG Automotive: An innovative tool for the development and validation of vehicles (Medium)
- 4. Unraveling the World of ADAS: Enhancing Automotive Safety and Comfort (Medium)
- 5. Clustering unveiled: The Intersection of Data Mining, Unsupervised Learning, and Machine Learning (Medium)
- More posts are available on my Medium account.