Job Description
Are you ready to shape the future of intelligence?
Nebula AI Labs is on a mission to define the technological landscape of 2026 and beyond. We are seeking a visionary Senior Generative AI & LLM Engineer to lead the development of next-generation autonomous agents and advanced Retrieval-Augmented Generation (RAG) systems.
In this high-impact role, you won't just be building models; you will architect the brain of our ecosystem. If you thrive on complexity, possess deep expertise in Large Language Models, and want to work at the bleeding edge of artificial intelligence, we want to meet you.
Why Join Us?
- Impactful Work: Directly influence how millions interact with AI in the coming years.
- Top-Tier Talent: Collaborate with PhDs and industry veterans in a state-of-the-art facility.
- Future-Proof Your Career: Master the skills that will define the industry in 2026.
Key Responsibilities:
- Architect and deploy scalable RAG pipelines to integrate private enterprise data with LLMs.
- Develop and fine-tune Foundation Models using PyTorch and Hugging Face ecosystems for specific domain applications.
- Design and implement AI Agents capable of autonomous reasoning and multi-step tool execution.
- Optimize model inference latency and cost using quantization, pruning, and distributed serving (e.g., vLLM, TensorRT).
- Establish robust evaluation frameworks and metrics to monitor model hallucinations and performance drift.
- Research and prototype emerging architectures (e.g., Mixture of Experts, State Space Models) for the 2026 roadmap.
Qualifications:
- Bachelor’s degree in Computer Science, Mathematics, or a related field; Master’s or PhD preferred.
- 5+ years of experience in software engineering with a strong focus on Machine Learning or Deep Learning.
- Deep proficiency in Python and modern AI frameworks (PyTorch, TensorFlow, JAX).
- Proven track record of working with open-source LLMs (Llama 3, Mistral, Falcon) and fine-tuning methodologies (QLoRA, LoRA).
- Hands-on experience with vector databases (Pinecone, Milvus, Weaviate) and orchestration tools (LangChain, LlamaIndex, AutoGen).
- Experience deploying AI applications to production environments (AWS, GCP, or Azure).
- Strong understanding of prompt engineering and advanced prompt optimization techniques.
Benefits: Comprehensive health coverage, equity stake, remote-first flexibility, and continuous learning stipends.