Internship September 2026 - Agentic AI for Semi-Automating the Data Science Workflow
Ekimetrics · Paris
Ekimetrics is the European leader in data science, with over 500 data scientists and 1,000+ projects since 2006. With offices in Paris, London, New York, and Hong Kong, we conduct projects in over 50 countries across various sectors, including financial services, retail, telecom, healthcare, and more.
Our mission is to help companies audit their data opportunities, enhance their analytical capital, and deploy actionable solutions that maximize their marketing and operational performance while reinvigorating business models.
Our absolute focus is on delivering short-term gains while ensuring the long-term development of our clients' data capital. We are committed to offering the most advanced data science approaches and building ethical, sustainable AI practices.
Key Figures
18 years of experience in data science
500+ data scientists, all consultants
5 offices in Paris, London, New York, Hong Kong, and Shanghai
350+ clients (CAC40, Fortune500)
$1M+ in profit generated for our clients since 2006
1,000+ data science projects
Within Ekimetrics, the Innovation Department works on AI research topics in collaboration with our industrial and academic partners. The department comprises several PhD experts in generative AI, deep learning, computer vision, time series, explainability, and causality. Two CIFRE PhD projects are underway, with two more to start in 2025. Each expert leads teams tasked with testing state-of-the-art algorithms and adapting them to specific business problems, developing new methodologies or algorithms to address identified challenges, and ensuring a seamless handover for integration within Ekimetrics' industrial ecosystem.
A central focus of the Innovation Department is closing the gap between cutting-edge AI agents and the day-to-day reality of delivering data science at scale. As large language models reshape how software and analytics are produced, the department is investing in agentic systems that can accelerate the full data science lifecycle, from data understanding and feature engineering through modeling, evaluation, and deployment, while keeping data scientists firmly in control of quality, governance, and business relevance. This internship sits at the heart of that effort, combining applied AI research with production-grade engineering on Ekimetrics' modern data stack.
In recent years, foundation models and LLM-based agents have transformed software engineering and knowledge work. Yet the data science workflow itself (problem framing, data preparation, feature engineering, modeling and experimentation, evaluation, deployment, and monitoring) remains largely manual, expert-intensive, and difficult to scale. A typical project follows a recognizable sequence of steps, which means it can be explicitly structured as a workflow that a multi-agent system can accelerate, with human review at the points where judgment matters most.
This internship will focus on designing and building such a system: a multi-agent orchestration in which specialized agents (e.g., feature engineering, modeling, evaluation) collaborate under a coordinating agent to produce runnable code that a data scientist reviews, refines, and merges. The system is intentionally semi-automated: some steps run human-in-the-loop with explicit approval gates, while others run autonomously with human review of the output. The intern will start with the feature engineering and modeling/experimentation loop, the part of the workflow where iteration is most intensive, and progressively extend coverage toward the end-to-end pipeline.
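As a purely illustrative sketch of the semi-automated pattern described above (not Ekimetrics' actual system), the coordination logic could look like the following in Python. Every name here (Step, RunLog, run_workflow, the stub agents, and the approve callback) is hypothetical, and the agents return hard-coded strings where a real system would call an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    name: str
    # Hypothetical agent: takes the code produced so far, returns new code.
    agent: Callable[[Dict[str, str]], str]
    # True = human-in-the-loop gate; False = runs autonomously, reviewed afterwards.
    needs_approval: bool


@dataclass
class RunLog:
    artifacts: Dict[str, str] = field(default_factory=dict)
    pending_review: List[str] = field(default_factory=list)
    halted_at: Optional[str] = None


def run_workflow(steps: List[Step], approve: Callable[[str, str], bool]) -> RunLog:
    """Coordinator: runs each specialized agent in order and applies approval gates."""
    log = RunLog()
    for step in steps:
        code = step.agent(log.artifacts)
        if step.needs_approval and not approve(step.name, code):
            # The reviewer rejected the output: stop so they can refine it.
            log.halted_at = step.name
            break
        log.artifacts[step.name] = code
        if not step.needs_approval:
            # Autonomous step: keep the output, but queue it for later human review.
            log.pending_review.append(step.name)
    return log


# Stub agents standing in for LLM-backed ones (assumption for illustration only).
def feature_agent(context: Dict[str, str]) -> str:
    return "df['ratio'] = df['a'] / df['b']"


def modeling_agent(context: Dict[str, str]) -> str:
    return "model.fit(X, y)  # built on: " + ", ".join(sorted(context))


STEPS = [
    Step("feature_engineering", feature_agent, needs_approval=True),
    Step("modeling", modeling_agent, needs_approval=False),
]

if __name__ == "__main__":
    result = run_workflow(STEPS, approve=lambda name, code: True)
    print(result.artifacts)
    print("awaiting review:", result.pending_review)
```

In this sketch the gate is a simple callback, so the same workflow can run with an interactive prompt, a pull-request review, or an automatic approver in tests.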
The work builds on recent state-of-the-art research in agentic data science, including AIDE [1], which frames machine learning engineering as a tree search over candidate code solutions; Data Interpreter [2], an end-to-end LLM agent that decomposes a data science task into a dynamic dependency graph; and AutoKaggle [3], a multi-agent pipeline whose phase-by-phase design and validated toolkit allow users to intervene at each stage, integrating automated intelligence with human expertise.
This internship offers an opportunity to engage with cutting-edge AI research and production engineering.
Key Responsibilities:
Working alongside senior data scientists and AI experts who lead the system's overall design and architecture, the intern will:
Conduct a focused literature review on agentic data science, automated machine learning, and LLM-agent techniques (planning, tool use, multi-agent orchestration), including human-in-the-loop and evaluation methods.