PlusAI is a Physical AI company pioneering AI-based virtual driver software for factory-built autonomous trucks. Headquartered in Silicon Valley with operations in the United States and Europe, Plus was named by Fast Company as one of the World's Most Innovative Companies. Partners including TRATON GROUP's Scania, MAN, and International brands, Hyundai Motor Company, Iveco Group, Bosch, and DSV are working with Plus to accelerate the deployment of next-generation autonomous trucks. If you're ready to make a huge impact and drive the future of autonomy, Plus is looking for talented individuals to join its fast-growing teams.
We're seeking an enthusiastic and driven Simulation/ML Engineer Intern to join our team and work on a project that blends Large Language Models (LLMs) with simulation technology. In this role, you'll help develop a tool that generates realistic, scalable simulation scenarios from text and real road data with minimal manual effort. This is an opportunity to apply your machine learning and robotics knowledge to real-world challenges while helping us expand large-scale, LLM-driven simulation testing of autonomous vehicles.
Your opportunities at PlusAI:
• Work, learn, and grow in a future-oriented, innovative, and dynamic field.
• Wide range of opportunities for personal and professional development.
• Free catered lunch, plus unlimited snacks and beverages.
• Highly competitive salary and benefits package, including a 401(k) plan.
Responsibilities:
• Build an AI Assistant: Develop and deploy an internal AI chatbot that allows employees to query company knowledge and test results using natural language.
• Implement RAG Architecture: Design and build a secure Retrieval-Augmented Generation (RAG) pipeline to pull contextual data from internal sources without compromising data privacy.
• Develop Data Pipelines: Create automated pipelines to ingest, clean, and structure data from diverse sources, including internal documents, Slack conversations, and autonomous driving databases (bagdb, pluscene, and right-seater logs).
• Fine-Tune Open-Source LLMs: Work with open-source models (such as Qwen) and fine-tune them to accurately understand and process company-specific terminology and AV testing metrics.
• Generate Actionable Insights: Enable the system to synthesize complex data across simulation and road tests to answer questions about pass rates, test mileage, and coverage gaps, and to provide testing recommendations.
Required Skills:
• Machine Learning & NLP: Solid understanding of Large Language Models (LLMs), natural language processing, and prompt engineering.
• Python Programming: Strong proficiency in Python for machine learning workflows, scripting, and backend system integration.
• Data Engineering Fundamentals: Experience building data extraction, transformation, and loading (ETL) pipelines, as well as handling both structured and unstructured data.
• Familiarity with RAG: Core understanding of Retrieval-Augmented Generation workflows, text chunking, and vector embeddings.
Preferred Skills:
• Open-Source LLM Experience: Hands-on experience deploying, fine-tuning, or quantizing open-source models (e.g., Qwen, LLaMA, Mistral) using frameworks like Hugging Face or vLLM.
• Vector & Relational Databases: Experience working with vector databases (e.g., Milvus, Chroma, FAISS) as well as querying traditional SQL/NoSQL databases.
• Autonomous Vehicle Domain Knowledge: Familiarity with autonomous driving data formats (e.g., ROS bags), simulation environments, or road testing metrics.
• Chatbot Frameworks: Experience with LLM orchestration frameworks such as LangChain or LlamaIndex.
• Data Security & Privacy: An understanding of best practices for deploying ML models locally or within secure, internally hosted environments.