Cureus. 2025 Jan;17(1): e76745
Background Birth control methods (BCMs) are often underutilized or misunderstood, especially among young individuals entering their reproductive years. With the growing reliance on artificial intelligence (AI) platforms for health-related information, this study evaluates the performance of ChatGPT-4o and Google Gemini in answering commonly asked questions about BCMs.
Methods Thirty questions, derived from the American College of Obstetricians and Gynecologists (ACOG) website, were posed to both AI platforms. The questions spanned four categories: general contraception, specific contraceptive types, emergency contraception, and other topics. Responses were evaluated using a five-point rubric assessing Relevance, Completeness, and Lack of False Information (RCL). Overall scores were calculated by averaging the three rubric scores. Statistical analysis, including the Wilcoxon Signed-Rank test, Friedman test, and Kruskal-Wallis test, was performed to compare metrics.
Results ChatGPT-4o and Google Gemini provided high-quality responses to birth control-related queries, with overall scores averaging 4.38 ± 0.58 and 4.37 ± 0.52, respectively, both categorized as "very good" to "excellent." ChatGPT-4o scored higher on lack of false information, based on descriptive statistics (4.70 ± 0.60 vs. 4.47 ± 0.73), while Google Gemini outperformed ChatGPT-4o in relevance, with a statistically significant difference (4.53 ± 0.57 vs. 4.30 ± 0.70, p = 0.035, large effect size). Completeness scores were comparable (p = 0.655). Statistical analyses revealed no significant difference in overall performance (p = 0.548), though Google Gemini showed a potential trend toward stronger performance in the "Other Topics" category. Within-model variability showed that ChatGPT-4o had more pronounced differences among metrics (moderate effect size, Kendall's W = 0.357), while Google Gemini exhibited smaller variability (Kendall's W = 0.165).
These findings suggest that both platforms offer reliable and complementary tools for addressing knowledge gaps in contraception, with nuanced strengths that warrant further exploration.
Conclusions ChatGPT-4o and Google Gemini provided reliable and accurate responses to BCM-related queries, with slight differences in their strengths. These findings underscore the potential of AI tools in addressing public health information needs, particularly for young individuals seeking guidance on contraception. Further studies with larger datasets may elucidate nuanced differences between AI platforms.
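The scoring arithmetic described in the Methods can be illustrated with a minimal Python sketch. It assumes the overall score is the plain mean of the three RCL rubric scores, and it computes Kendall's W from per-response ranks without a tie correction. The abstract does not state which software or tie handling was used, so the function names (`overall_score`, `average_ranks`, `kendalls_w`) and the example scores are hypothetical and are not the study's data.

```python
from statistics import mean


def overall_score(relevance, completeness, lack_of_false_info):
    """Average the three rubric scores (each on a 1-5 scale).

    Assumption: an unweighted mean, as the abstract's wording suggests.
    """
    return mean([relevance, completeness, lack_of_false_info])


def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def kendalls_w(rows):
    """Kendall's W for n rows (responses) by k columns (metrics).

    Uses W = 12 * S / (n^2 * (k^3 - k)), with no tie correction
    (a simplification; statistical packages may correct for ties).
    """
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    mean_rank_sum = n * (k + 1) / 2
    s = sum((rs - mean_rank_sum) ** 2 for rs in rank_sums)
    return 12 * s / (n ** 2 * (k ** 3 - k))


# Hypothetical scores for one response: (relevance, completeness, lack of false info).
example = overall_score(5, 4, 4)
```

Here W ranges from 0 (no consistent ordering of the metrics across responses) to 1 (every response ranks the metrics identically), which is how the reported values of 0.357 and 0.165 can be read as moderate and small within-model differences.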
Keywords: artificial intelligence; birth control methods; chatgpt-4o; contraception; google gemini; health information