Working Student – AI Engineering (20h/week)

Rhesis AI
Part-time
Remote
Germany
Technology & Development
Your mission
Join us at Rhesis AI – Open-source testing platform & SDK for LLM and agentic applications.

At Rhesis AI, we empower organizations to develop and deploy LLM and agentic applications that meet high standards for reliability, robustness, and compliance. As the creators of an open-source solution for test generation, management & execution, we enable AI teams to build context-specific tests and collaborate directly with domain experts. If you're passionate about LLMs, evaluation methods, and advancing trustworthy AI through practical tools, we invite you to join our mission.

Your profile
WHAT YOU WILL DO
  • Help design and implement features to assess the performance, safety, and quality of LLM and agentic applications.
  • Experiment with prompt engineering, model behaviors, and evaluation metrics to improve test generation and coverage.
  • Contribute to building and curating datasets for AI testing scenarios across different domains and use cases.
  • Explore and prototype new approaches to LLM evaluation, drawing on current research and emerging best practices.
  • Collaborate with our engineering team to translate experimental findings into production-ready features.
YOU ARE GREAT FOR THIS ROLE IF YOU
  • Are currently enrolled at a German university (Immatrikulationsbescheinigung required) and based in Germany.
  • Are currently pursuing a Bachelor's or Master's degree in Computer Science, Data Science, AI, Machine Learning, or a related field.
  • Have solid hands-on experience with Python and are comfortable working with data pipelines and scripting.
  • Are genuinely curious about large language models, prompt engineering, and how to systematically evaluate AI systems.
  • Have experience with at least one ML/AI framework (e.g., PyTorch, HuggingFace Transformers, LangChain, or similar).
  • Understand basic concepts in NLP, evaluation metrics, and statistical analysis.
  • Are comfortable using Git and working in a collaborative development environment.
  • Enjoy reading papers, experimenting with new ideas, and turning research into working code.
HOW TO APPLY

Want to get a feel for how we work? Check out our GitHub repo: https://github.com/rhesis-ai/rhesis

Please submit your application exclusively through our application portal. We're unable to consider applications sent via email or LinkedIn. This helps us ensure every application gets the attention it deserves. Please upload your current transcript showing your grades (Notenspiegel) using the "Urkunde" field. You can usually download this from your university's online portal (e.g., HIS, CampusNet, TUMonline).
Why us?
We're offering a 20h/week contract starting 1 February, along with:
  1. Work at the forefront of AI: Collaborate with innovative teams building LLM applications. Shape the open-source tools that define how AI applications are tested and validated.
  2. Deepen your AI expertise: Get hands-on experience with LLM evaluation and prompt engineering – skills in high demand across the industry.
  3. Flexible work arrangements: We support remote work and have a workspace in Potsdam (Griebnitzsee).
  4. A supportive team: We foster collaboration, mutual respect, and support your professional growth.
At Rhesis AI, we value diversity and inclusion. We encourage applications from individuals of all backgrounds. Even if you don't meet every requirement, we'd love to hear from you – your unique perspective could be exactly what we need.
About us
At Rhesis AI, we're building the open-source testing infrastructure that LLM and agentic applications need to earn trust at scale. We believe AI teams shouldn't have to choose between moving fast and shipping reliably – so we're creating tools that make evaluation collaborative, context-specific, and built into the development workflow.