Work
  • Solita
    Data Scientist/Data Engineer
    Mar 2025 - Current

    I'm a Data Scientist/Data Engineer at Solita, where I design and develop data pipelines and workflows.

    ✦ Built and orchestrated end-to-end data pipelines from ingestion to dbt transformations.
    ✦ Implemented cloud infrastructure with Terraform and CI/CD pipelines.
    ✦ Developed Flask APIs to serve data across internal apps.

  • Braive, digital therapy platform
    Data Scientist
    Aug 2024 - Feb 2025

    ✦ Analyzed app usage data to improve the onboarding funnel and user retention.
    ✦ Mapped business needs to event tracking in the Braive app, giving PMs and leadership visibility into the product.
    ✦ Owned data in AI feature sprints to automate the clinicians' work.

  • RISE Research Institutes of Sweden
    Master's Thesis Intern
    Jan 2024 - Jun 2024

    ✦ Designed, built, and trained data synthesis systems for autonomous driving.
    ✦ Built a data generation pipeline for rare driving scenarios.
    ✦ Worked as an ML engineer on a joint project between RISE and Smart Eye.

  • KBLab, National Library of Sweden
    Data Scientist
    Jan 2023 - Dec 2023

    ✦ Built evals for curating the data used to train Swedish LLMs such as KBWhisper.
    ✦ Constructed workflows for digitizing text materials, with OCR on PDFs and text segmentation using pre-trained models from Hugging Face.
    ✦ Curated large national datasets, including the parliamentary proceedings, to support transparency into the Swedish government.

Education
  • Uppsala University
    MSc - Machine Learning (Aug 2022 - Jun 2024)
    BSc - Mathematics and Statistics (Aug 2018 - Jun 2022)

    In parallel with my studies, I served as a lecturer and held statistics workshops on experimental design and predictive modeling for data science undergraduates.

    I also developed a few projects:

    ✦ Built a real-time pipeline with Kafka and Spark on UPPMAX clusters.
    ✦ Implemented a Bayesian sports ranking algorithm in Python.
    ✦ Benchmarked and tuned clustering algorithms on real-world datasets.