
hackajob

Python Developer

3-5 Years
  • Posted 14 hours ago

Job Description

Senior Python Engineer / LLM Evaluation (Part-Time, Remote)

We are hiring experienced Python engineers for part-time, task-based work focused on evaluating and testing Large Language Models (LLMs).

This is not traditional QA and not junior AI labeling work. We are looking for senior engineers who can reason deeply about system behavior, ambiguity, and real-world usage, not just write code.

What You'll Do

Design structured test cases that simulate real human workflows

Define gold-standard outputs and expected behaviors

Analyze LLM failure modes such as hallucinations, bias, and context limitations

Work directly with Git repositories and existing codebases

Navigate incomplete documentation and ambiguous requirements

Apply engineering judgment to determine what "good" looks like

Who You Are

3+ years of software development experience (Python-focused)

Python is your primary language

Strong hands-on Git experience in real projects

Comfortable reading and debugging code you didn't write

Able to reason about edge cases, trade-offs, and ambiguity

Strong written and spoken English (B2+)

Nice to Have

QA or structured testing experience (must be code-capable)

Experience evaluating AI or LLM systems

Familiarity with evaluation metrics such as precision, recall, and coverage

Experience working with Docker

Consulting or freelance engineering background

What We're Looking For

We value engineers who can explain why something fails, not just that it fails. If you naturally think in terms of scenarios, assertions, failure modes, and user expectations, you'll thrive here.

This role suits senior backend Python engineers, ML engineers who still code regularly, and technically strong evaluators with real production experience.

Fully remote. Flexible schedule. Task-based delivery.

If you're interested in applying your engineering judgment to real-world AI system evaluation, we'd love to hear from you.


Job ID: 143833967
