Sami Moustachir




French data scientist with 5+ years of experience in NLP and big data.

Data consultant specialized in machine learning and infrastructure.


Masters of Engineering Science, Ecole Nationale Supérieure des Mines de Nancy, Nancy, France, 2013-2016
Relevant Courses: Statistics/Probability/Fluid Mechanics/Thermodynamics/Numerical Analysis/Informatics
Exchange student at Technical University of Munich. Courses: Statistical Modelling and Machine Learning, Probabilistic Graphical Models, Machine Learning for Computer Vision.

Preparatory Classes, Lycée Saint Louis, Paris, 2010-2013
Three-year, intensive, post-high school courses in advanced and applied mathematics, physics and computer science to prepare for the national entrance examination for the "Grandes Ecoles".

Professional Employment

Consultant Data Scientist - Singapore, September 2019 - present

  • Machine Learning for Thales Digital Factory.
  • Infrastructure at Xendit.
  • Built a streaming application to process real-time events and compute optimizations. (Scala, Kafka, Spark Streaming)
  • Built a machine learning application to process time series data. (Python, Spark)

Founder in Residence at Entrepreneur First & Lead Data Science Instructor at General Assembly - Singapore, Jan 2019 - September 2019

  • Entrepreneur First is the world’s leading talent investor.
  • Developed a prototype for understanding defects in the construction industry.
  • Taught the Data Science Immersive Program for 3 months.
  • Technologies: Python, SQL, Mongo, NodeJS, React, GCP

Data Scientist, French Innovation Fellow, Ministry for Higher Education and Research - Paris, Jan 2018 - Dec 2018

  • Worked on extracting relevant information for higher education and research using NLP.
  • Built a graph database to better handle queries from a search engine.
  • Built a scalable ETL engine to process massive data feeds.
  • Developed a POC based on bi-directional attention flow to classify documents by research concept.
  • Technologies: AllenNLP, Neo4J, Python, Airflow

Data Scientist AXA DIL, Paris, Jul 2017 - Dec 2017

  • Worked on fraud detection in the insurance market with Apache Spark and Cloudera Hue.
  • Technologies: Python, Scala, Spark, Hadoop

Data Scientist Mention, Paris, Oct 2016 - April 2017

  • Worked on a neural probabilistic language model.
  • Extended the gensim implementation to work on a multilingual aligned corpus to boost sentiment analysis accuracy for rare languages.
  • Investigated clustering methods (K-Means, Hierarchical Clustering, DBSCAN...) using word embeddings for tagging.

Data Engineer Stratagem Technologies, London, Nov 2015 - March 2016

  • Worked in the Trading System Engineering team on the execution algorithms.
  • Built an asynchronous engine to retrieve real-time data from third-party providers.
  • Worked on parallelizing data processing with Spark.
  • Wrote a connector to IPython to visualize backtest results with Bokeh and Pandas.
  • Technologies: Python, SQL, Mongo, Cassandra, AWS, API, Spark

Associate Program Techstars, London, July 2015 - Oct 2015

  • Hacking growth for Techstars London '15 companies. Python and data enthusiast.
  • Scraping data from the web and probably responsible for spamming thousands of inboxes.
  • Going through a ton of APIs for integration and ending up emailing the companies to fix the bugs in their APIs.
  • Developing an iOS app as a side project so you never miss your tube stop again, soon to be released!

Technology Applications and Expertise

Specialized Skills
Data Analysis, Algorithms, Machine Learning, Mathematics, Computer Vision, DevOps, Git
Extensive knowledge of Python and Scala, good knowledge of Swift, and curious about any programming language with good documentation

Achievements & Awards

Member of the winning team at Startup Weekend Paris: Makers Edition with Foreplay
Third prize at BT Hackathon with a Slack bot to receive voice messages
Member of the winning team at the Catapult System Intelligent Transport hackathon with Tubester, an app that warns you when to get off the tube