AI healthcare: Google (GOOGL.US) unveils a major move! It has open-sourced the world's first "AI generalist doctor" that runs without heavy computing power, so hospitals can deploy it with one click!

19:47 17/01/2026
GMT Eight
Just now, there was a new breakthrough in AI healthcare from Google (GOOGL.US), and this time it goes straight at the pain points of real clinical environments.

For a long time, medical models have been like students with a narrow focus: good at "reading medical records" but weak at handling medical images such as CT scans, MRIs, and pathology slides, because they have been forced to interpret images through text logic, with low efficiency, frequent errors, and high costs. In response, Google has introduced its latest model, MedGemma 1.5, as a breakthrough solution. Compared with the first-generation MedGemma, MedGemma 1.5 makes a major leap in multimodal capability, integrating:

- High-dimensional medical images: computed tomography (CT), magnetic resonance imaging (MRI), and histopathology.
- Longitudinal medical imaging: retrospective chest X-ray time series.
- Anatomical localization: identifying anatomical features in chest X-rays.
- Medical document understanding: extracting structured data from medical laboratory reports.

Google states that MedGemma 1.5 is the first publicly released open-source multimodal large language model that can interpret high-dimensional medical data while retaining the ability to interpret generic 2D images and text. What's more, MedGemma 1.5 has only 4 billion parameters, meaning even an ordinary consumer-grade graphics card or a high-performance workstation can run it smoothly.

In addition, Google has released MedASR, a speech recognition model fine-tuned specifically for medical speech, which can transcribe conversations between doctors and patients and integrates seamlessly with MedGemma. In simple terms, MedGemma 1.5 answers "how to see images," while MedASR answers "how to listen." This is not just another model iteration but Google's systematic answer to how to truly bring AI into the clinic. An AI doctor that can read medical records, understand images, and hear clearly is about to enter hospitals.

AI healthcare enters the multimodal era

Over the past year we have watched models such as GPT-5 post impressive scores on medical exams, yet their performance in real clinical scenarios often falls short. One important reason is the gap in information dimensions: many medical models, including the first-generation MedGemma, are essentially text experts with limited understanding of images, so diagnostic information is lost.

MedGemma 1.5 delivers broad, multi-dimensional gains in medical imaging, significantly surpassing its predecessor.

For high-dimensional medical images, MedGemma 1.5 achieves:

- Disease-classification accuracy on CT scans up from 58% to 61%.
- Disease-classification accuracy on MRIs up from 51% to 65%, with particularly notable progress on complex anatomical structures such as the brain and joints.
- A ROUGE-L score for full pathology descriptions up from a nearly useless 0.02 to 0.49, on par with the specialized model PolyPath (0.498) and capable of generating clinically usable histological descriptions.

For longitudinal temporal image analysis, MedGemma 1.5 achieves:

- Macro-accuracy on the MS-CXR-T temporal evaluation benchmark up from 61% to 66%.
- Effective capture of lesion dynamics, such as judging whether a pneumonia infiltrate is resolving, to support follow-up decisions.

For generic 2D medical image interpretation, MedGemma 1.5 achieves:

- Overall classification accuracy up from 59% to 62% on an internal comprehensive single-image benchmark covering X-rays, skin, fundus, and pathology slides.
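Two of the metrics quoted in this article, ROUGE-L (used above to score pathology descriptions) and word error rate (used below to compare MedASR against Whisper), are both simple sequence-comparison measures. A minimal sketch of how they are typically computed (this is generic illustration, not Google's evaluation code):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]


def rouge_l(reference: str, candidate: str) -> float:
    """ROUGE-L F-measure over whitespace tokens (beta = 1)."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Single-row dynamic-programming table for Levenshtein distance.
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(
                d[j - 1] + 1,     # insert a hypothesis word
                d[j] + 1,         # delete a reference word
                prev + (r != h),  # substitute (free if the words match)
            )
    return d[len(hyp)] / len(ref)
```

On this scale, a "58% reduction in error rate" means the fine-tuned model's WER is 0.42 times the baseline's on the same transcripts.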
The single-image result shows that adding high-dimensional tasks did not sacrifice the model's broad 2D capability.

For structured medical documents, MedGemma 1.5 achieves:

- A macro-average F1 for extracting test items, values, and units from unstructured PDFs or text up from 60% to 78% (+18 points).
- Automatic construction of structured databases, closing the final link for fusing image, text, and lab-test information.

Meanwhile, traditional automatic speech recognition (ASR) models tend to behave like complete novices when they encounter unfamiliar medical terms, and the resulting high word error rates make AI-assisted data entry a burden for doctors. The newly released MedASR has been fine-tuned for medical use, cutting error rates sharply: compared with the general-purpose ASR model Whisper large-v3, researchers found that MedASR reduces the error rate on chest X-ray dictation by 58% and on dictation across other specialties by 82%.

Trillion-dollar Google bets on AI healthcare

Google has a deep footprint in healthcare, with its technology reaching into every corner of the industry.

On the investment side, Google has backed many life-science companies through its venture capital and private equity arms. AI drug discovery has become a key focus: in 2021, 28 of Google Ventures' 51 healthcare investments were drug-development projects, more than half.

On the collaboration side, leveraging its industry-leading artificial intelligence, cloud computing, and other services, Alphabet Inc.
has recently entered into agreements with pharmaceutical companies and hospitals such as Bayer, Pfizer, Servier, and the Mayo Clinic to explore intelligent solutions spanning drug development through clinical diagnosis and treatment.

Internally, beyond Google Health, Alphabet has business units focused on different areas, including Verily and Calico, forming a diversified and powerful matrix. In particular, Google DeepMind, one of the world's top AI research institutions, has launched several scientifically significant models, including AlphaFold (protein structure), AlphaGenome (DNA regulation), and C2S-Scale (single-cell biology). DeepMind CEO Demis Hassabis won the 2024 Nobel Prize in Chemistry for his contributions to AI protein-structure prediction.

In recent years, riding the large-language-model wave, Google has developed multiple large models for healthcare. These models not only help doctors diagnose diseases more accurately but also provide patients with personalized health advice.

The Google team first developed Flan-PaLM, which took on the U.S. Medical Licensing Examination (USMLE) and scored 67.6%, more than 17 percentage points above the previous best model. Google then released Med-PaLM, which was featured in Nature; in evaluations by professional clinicians, its accuracy on practical questions was nearly identical to that of humans. In 2023, Med-PaLM M, the world's first generalist medical large model, was released; it approached or exceeded the state of the art on 14 test tasks spanning question answering, report generation and summarization, visual question answering, medical image classification, and genomic variant calling.

Last year, Google's Chief Health Officer Dr.
Karen DeSalvo announced six advances, including the AI drug-discovery model TxGemma, an FDA-cleared loss-of-pulse detection feature for smartwatches, the multi-agent "AI co-scientist" system, personalized pediatric cancer treatment models, and more.

From medical imaging to drug development, from health assistants to wearable devices, Google is redefining the future of healthcare.

This article was reprinted from the WeChat public account "Intelligent Medicine Bureau"; GMTEight editor: Yan Wencai.