Job Description
QA STEM Expert (Physics, Chemistry, Biology)

We are hiring a QA STEM Expert to serve as a high-caliber quality gatekeeper for AI-generated responses to complex STEM questions. This is not traditional academic proofreading. You will evaluate whether questions truly challenge frontier AI models and whether the model outputs meet rigorous standards of factual accuracy, logical soundness, and scientific validity.

This role is ideal for a deep-thinking scientist who enjoys stress-testing AI capabilities, identifying subtle reasoning flaws, and shaping high-quality benchmarks in AI model evaluation.

Key Responsibilities

- Review and refine STEM questions (Physics, Chemistry, Biology) to ensure they effectively test conceptual depth and challenge AI models.
- Evaluate AI-generated responses for:
  - Factual accuracy and scientific correctness
  - Logical consistency and soundness of reasoning
  - Step-by-step derivation accuracy (including math, units, and terminology)
  - Absence of hallucinations, flawed assumptions, or incomplete explanations
- Assess whether a question successfully "stumps" the model and classify failure modes (reasoning-based, knowledge-based, interpretation, computational, etc.).
- Apply strict evaluation rubrics and guidelines while providing structured, actionable feedback.
- Document evaluations, error categorizations, and scoring reports with high clarity.
- Collaborate with AI teams and fellow STEM experts to improve question design, rubrics, and overall benchmarking standards.

Required Qualifications

- Master’s or PhD in Physics, Chemistry, Biology, or a closely related STEM field.
- 4+ years of relevant experience in academic research, teaching, assessment, or technical evaluation.
- Strong conceptual mastery in at least one core STEM domain (multi-domain expertise is highly preferred).
- Exceptional critical thinking skills with the ability to dissect complex reasoning chains.
- Excellent written English and structured analytical communication.

Preferred Qualifications

- Prior experience evaluating AI/LLM outputs or participating in model benchmarking.
- Familiarity with rubric-based grading, prompt engineering, or AI testing frameworks.
- Exposure to logic validation or red-teaming complex systems.

Core Competencies

- Deep scientific rigor and intellectual skepticism
- Analytical precision in identifying subtle errors
- Structured thinking and guideline adherence
- Root-cause analysis of model failures
- Intellectual curiosity and commitment to truth-seeking

Want to stand out? Mail me your CV.

“Please note that the organization follows a strict no-fee recruitment policy and does not request any payment from candidates at any stage of the hiring process, including through third-party agencies or consultants. Candidates are advised to immediately report any such payment requests in connection with employment opportunities, as they are unauthorized and fraudulent.”