Assessing AI Chatbots as Teachers: A Pilot Study of Pedagogical Competencies for ChatGPT, Gemini AI, and Claude
This pilot experimental study evaluates the pedagogical competencies of three leading AI chatbots—ChatGPT, Gemini AI, and Claude—through the lens of established educational frameworks and standards. Conducted with 14 pre-service teachers enrolled in a Master of Education program, the research employs a within-subjects design to assess the AI assistants' alignment with key teaching dimensions derived from Romania's Teacher Profile, Australia's AITSL Standards, and the OECD Education 2030 framework. A novel rubric operationalizing five core competencies (cognitive scaffolding, adaptability, formative feedback, accuracy, and academic integrity) was developed to quantify AI performance. Results indicate that Claude outperformed its competitors across most metrics, particularly in promoting student agency and personalized learning, while ChatGPT scored lowest. However, all models demonstrated significant ethical limitations, scoring below 3/5 on academic safety. The study highlights AI chatbots' potential as pedagogical partners while underscoring critical gaps in ethical implementation. As the initial phase of a broader research program, this work establishes foundational metrics for evaluating AI as teachers, with future directions targeting expanded participant diversity, cross-disciplinary prompts, and longitudinal analysis of student outcomes. The findings contribute to the emerging discourse on human-AI teaching collaboration, emphasizing the need for context-specific benchmarks beyond traditional accuracy metrics.
