Name: Hanieh Haeri

Profile: A Seasoned Data Scientist and Machine Learning Enthusiast

Email: hhaeri0911@gmail.com

Current Location: Alamo, California, United States

Resume

Skills

Python 100%
Machine Learning 100%
Computer Vision 90%
Large Language Models 80%
Statistics 100%
Data Visualization 100%
Data Structures & Algorithms 90%
Databases 90%
Cloud & Containerization 70%

About me

Hello, I am Hanieh!

I am a seasoned Data Scientist based in the San Francisco Bay Area with a PhD in Engineering from the University of California, Davis. I am passionate about applying cutting-edge technologies to drive data-driven decisions and craft intelligent business strategies across industries. My core interest is bridging industry and AI, with a focus on developing interpretable and responsible AI models. I'm interested in exploring applications of vision models and generative AI, and I’m motivated by the challenge of applying these technologies to real-world problems where they can make a meaningful impact on people’s lives.

I specialize in integrating AI and big data within modern geospatial technology to solve complex spatial challenges. My work involves designing scalable data pipelines, building predictive models, and developing intelligent geospatial applications that support strategic decision-making. With a strong foundation in data science and a deep passion for Earth science, I’m especially motivated by projects that explore the environmental and societal impacts of spatial data. Whether analyzing urban infrastructure or resource distribution, I aim to deliver insights that drive sustainable and equitable outcomes. As modern geospatial technology takes center stage across industries, I’m committed to shaping its future through machine learning, innovation, and a strong sense of social and environmental responsibility.

Interests:

Machine Learning, Deep Learning, Data Science, Computer Vision Models, Generative Models.

Skills:

Languages: Python, SQL, MATLAB, PySpark

Data & ML Frameworks: Scikit-Learn, Spark, PyTorch, Keras, TensorFlow, Pandas, NumPy, SciPy, SQL, Tableau

Cloud & Containerization Platforms: Amazon Web Services

Web development: Flask, Streamlit, CSS, HTML

Experience

Freelance AI/ML Consultant, 2023 - present

  • Interpretable AI for Medical Imaging Data
    • Enhanced a generative model based on a Variational Autoencoder (VAE) architecture to provide interpretable AI-driven diagnostics for medical imaging, in collaboration with UCSF.
    • Identified user needs through feedback from medical practitioners, focusing on a simple, code-free tool to improve model understanding and build trust.
    • Developed a Streamlit app prototype for visual interaction with the VAE model’s latent space, boosting user engagement and confidence in the model's predictions.
  • Semantic Segmentation for Extracting Geologic Features from Historic Topographic Maps
    • Developed and trained a UNet-based Semantic Segmentation model in TensorFlow, reducing manual feature extraction efforts by 54% for large-scale geologic data processing.
    • Delivered a scalable, high-efficiency AI-driven solution that supported business expansion goals and long-term ROI.
  • Causal Machine Learning for Coupon Campaign Optimization
    • Leveraged Double/Debiased Machine Learning (DML) to identify high-impact customer segments for coupon targeting from observational data, eliminating the need for costly A/B testing experiments.
    • Improved pricing and promotion campaign ROI by 48% versus traditional ML by accurately isolating causal effects from historical data.

Montgomery & Associates - Engineering Consultant, 2014 - 2022

• Led the integration of Python and SQL workflows to eliminate data processing bottlenecks, reducing processing time by 60% and empowering teams to make faster, data-driven decisions.

• Acquired, cleaned, and transformed data from diverse sources, addressing quality issues to ensure reliability and delivering analytical insights aligned with clients’ strategic goals and operational needs.

• Developed custom data pipelines to preprocess large datasets for accurate analysis while meeting client requirements.

• Applied physics-based and data-driven modeling, predictive modeling, experimental design, statistical analysis, and machine learning to deliver clear, actionable, and timely data-driven recommendations to stakeholders of varying technical experience.

• Leveraged machine learning outcomes to evaluate and recommend groundwater management strategies. Delivered robust, sustainable decision-making support for the supply of clean water to 1M+ residents in Northern California.

• Produced informative and visually compelling geospatial analysis products, maps, and dashboards to effectively communicate insights and key metrics to clients, driving actionable outcomes.

• Collaborated with cross-functional teams to translate data-driven insights into actionable recommendations.

• Wrote SQL and Python scripts for efficient data wrangling, web scraping, exploratory data analysis, and data visualization.

• Defined key performance metrics to surface business insights and give stakeholders fast, accurate data and reports that drive decision-making.

Projects

Interpretable AI for Medical Imaging Data

The focus of this research is to provide interpretable deep learning models for processing and classifying medical imaging data. Current efforts are directed toward refining multiple regularization techniques tailored to enhance the performance of a novel VAE architecture named CLAP.

Causal Machine Learning for Coupon Campaign Optimization

Leveraged Causal Machine Learning to identify customer groups for maximizing coupon campaign ROI. This approach yielded a 48% higher return on pricing and promotion investments compared to traditional experimentation methods like A/B testing.

Interpreting an Image Classifier

Developed and implemented an image classifier interpretation tool. The pipeline used a pre-trained ResNet model to generate predictions for selected images and applied model-agnostic interpretation tools via OmniXAI to examine the rationale behind the model's decisions.

Interpreting Text Classifiers and Benchmarking Explainers

Developed and implemented a text classifier interpretation tool. The pipeline used a pre-trained NLP model to perform sentiment analysis on selected movie reviews and applied model-agnostic interpretation tools via FerretXAI to examine the rationale behind the model's decisions.

Semantic Segmentation for Extracting Geologic Features from Historic Topographic Maps

Developed and implemented a Semantic Segmentation model for extracting features from historic USGS maps. The pipeline started from a pre-trained UNet model and fine-tuned it on the available data.

Prediction of Daily County-Level COVID Cases in School-Age Population using Machine Learning

Designed, created, and deployed a custom time series forecasting model to predict county-level daily COVID-19 case counts in the State of California using Python, Streamlit, and Jupyter Notebook.

Natural Language Processing with Yelp reviews

Performed Natural Language Processing on the Yelp Review dataset from Kaggle. Built bag-of-words and bigram models. Built a Naive Bayes model to find the most polar words. Analyzed restaurant reviews to find food bigrams.
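
The idea behind the polar-word and bigram analysis can be sketched in a few lines of Python. This is only an illustrative sketch, not the project code: the four toy reviews, the function names, and the smoothing value alpha are all assumptions made for the example, and the real project used the Yelp Review dataset. Word polarity is scored as the log-ratio of Laplace-smoothed per-class word likelihoods, which is the quantity a multinomial Naive Bayes model learns.

```python
import math
from collections import Counter

# Hypothetical toy reviews with labels (1 = positive, 0 = negative); not the Yelp data.
reviews = [
    ("great food and great service", 1),
    ("delicious fried chicken", 1),
    ("terrible service and terrible food", 0),
    ("cold fried chicken", 0),
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()

def bigrams(tokens):
    """Return adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))

def word_polarity(reviews, alpha=1.0):
    """Log-ratio of smoothed P(word | positive) to P(word | negative)."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in reviews:
        counts[label].update(tokenize(text))
    vocab = set(counts[0]) | set(counts[1])
    # Laplace smoothing: add alpha to every word count in each class
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in counts}
    return {
        w: math.log((counts[1][w] + alpha) / totals[1])
        - math.log((counts[0][w] + alpha) / totals[0])
        for w in vocab
    }

polarity = word_polarity(reviews)
most_positive = max(polarity, key=polarity.get)
most_negative = min(polarity, key=polarity.get)

# Bigram frequencies across all reviews
bigram_counts = Counter(bg for text, _ in reviews for bg in bigrams(tokenize(text)))
```

On real data, the same scores would be computed over the full vocabulary, and the bigram counts could be filtered to food-related terms.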

The New York Social Network Graph Project

Built a social network for New York's social elite by web scraping the New York Social Diary. Assembled the social graph with the assumption that people in the same picture are considered socially linked. Analyzed the graph and found the most popular socialites, most influential people and the most tightly coupled pairs.
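
The linking rule described above (people in the same picture are socially linked) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project code: the names and captions are made up, the helper functions are hypothetical, and scraping and caption parsing are omitted. Each unordered pair of names in a caption becomes an edge whose weight counts shared photos, and a person's weighted degree serves as a simple popularity score.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: each inner list holds the names parsed from one photo caption.
captions = [
    ["Alice Adams", "Bob Brown", "Carol Clark"],
    ["Alice Adams", "Bob Brown"],
    ["Carol Clark", "Dan Davis"],
]

def build_edge_weights(captions):
    """Count how often each unordered pair of people appears in the same photo."""
    weights = Counter()
    for names in captions:
        # sorted(set(...)) removes duplicate names and gives each pair a canonical order
        for pair in combinations(sorted(set(names)), 2):
            weights[pair] += 1
    return weights

def weighted_degree(weights):
    """Sum edge weights per person as a simple popularity score."""
    degree = Counter()
    for (a, b), w in weights.items():
        degree[a] += w
        degree[b] += w
    return degree

edge_weights = build_edge_weights(captions)
popularity = weighted_degree(edge_weights)

# The most tightly coupled pair is the edge with the largest weight
top_pair, top_weight = edge_weights.most_common(1)[0]
```

Influence measures such as PageRank would typically be computed with a graph library on top of these edge weights.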