|
S
|
AI in Japan—OpenAI’s Japan Economic Blueprint |
openai |
22.10.2025 00:00 |
1
|
| Embedding sim. | 1 |
| Entity overlap | 1 |
| Title sim. | 1 |
| Time proximity | 1 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai adoption |
| NLP country | Japan |
Open original
OpenAI’s Japan Economic Blueprint outlines how Japan can harness AI to boost innovation, strengthen competitiveness, and enable sustainable, inclusive growth.
|
|
|
Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac |
huggingface |
29.10.2025 00:00 |
0.92
|
| Embedding sim. | 0.9527 |
| Entity overlap | 0.4545 |
| Title sim. | 0.8675 |
| Time proximity | 0.9804 |
| NLP type | product_launch |
| NLP organization | nvidia |
| NLP topic | robotics |
| NLP country | |
Open original
Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac
Published October 29, 2025
Steven Palma, Andres Diaz-Pinto
TL;DR
A hands-on guide to collecting data, training policies, and deploying autonomous medical robotics workflows on real hardware
Table-of-Contents
Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac
TL;DR
Table-of-Contents
Introduction
SO-ARM Starter Workflow: Building an Embodied Surgical Assistant
Technical Implementation
Sim2Real Mixed Training Approach
Hardware Requirements
Data Collection Implementation
Simulation Teleoperation Controls
Model Training Pipeline
End-to-End Sim Collect–Train–Eval Pipelines
Generate Synthetic Data in Simulation
Train and Evaluate Policies
Convert Models to TensorRT
Getting Started
Resources
Introduction
Simulation has long been a cornerstone of medical imaging, helping to close the data gap. In healthcare robotics, however, it has often been too slow, too siloed, or too difficult to translate into real-world systems.
NVIDIA Isaac for Healthcare, a developer framework for AI healthcare robotics, helps developers solve these challenges by offering integrated data collection, training, and evaluation pipelines that work across both simulation and hardware. Specifically, the Isaac for Healthcare v0.4 release provides healthcare developers with an end-to-end SO-ARM-based starter workflow and a bring-your-own-operating-room tutorial. The SO-ARM starter workflow lowers the barrier for MedTech developers to experience the full simulate-train-deploy workflow and to start building and validating autonomous systems on real hardware right away.
In this post, we'll walk through the starter workflow and its technical implementation details to help you build a surgical assistant robot faster than ever before.
SO-ARM Starter Workflow: Building an Embodied Surgical Assistant
The SO-ARM starter workflow introduces a new way to explore surgical assistance tasks, providing developers with a complete end-to-end pipeline for autonomous surgical assistance:
Collect real-world and synthetic data with SO-ARM using LeRobot
Fine-tune GR00T N1.5, evaluate in Isaac Lab, then deploy to hardware
This workflow gives developers a safe, repeatable environment to train and refine assistive skills before moving into the Operating Room.
Technical Implementation
The workflow implements a three-stage pipeline that integrates simulation and real hardware:
Data Collection: Mixed simulation and real-world teleoperation demonstrations using SO-ARM101 and LeRobot
Model Training: Fine-tuning GR00T N1.5 on combined datasets with dual-camera vision
Policy Deployment: Real-time inference on physical hardware with RTI DDS communication
Notably, over 93% of the data used for policy training was generated synthetically in simulation, underscoring the strength of simulation in bridging the robotic data gap.
Sim2Real Mixed Training Approach
The workflow combines simulation and real-world data to address the fundamental challenge that training robots in the real world is expensive and limited, while pure simulation often fails to capture real-world complexities. The approach uses approximately 70 simulation episodes for diverse scenarios and environmental variations, combined with 10-20 real-world episodes for authenticity and grounding. This mixed training creates policies that generalize beyond either domain alone.
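As a rough illustration of this mix (the helper below and the exact episode counts are hypothetical, not part of the workflow's API), combining the two sources can be as simple as tagging each episode with its source domain and shuffling:

```python
import random

def mix_episodes(sim_episodes, real_episodes, seed=0):
    """Combine simulation and real-world episodes into one shuffled
    training list, tagging each episode with its source domain."""
    mixed = [("sim", ep) for ep in sim_episodes]
    mixed += [("real", ep) for ep in real_episodes]
    random.Random(seed).shuffle(mixed)
    return mixed

# Roughly the ratio described above: ~70 sim episodes, ~15 real ones.
dataset = mix_episodes(range(70), range(15))
sim_share = sum(1 for src, _ in dataset if src == "sim") / len(dataset)
print(f"{len(dataset)} episodes, {sim_share:.0%} simulated")
```

Keeping the domain tag on every episode makes it easy to verify later that a training run actually preserves the intended sim-to-real ratio.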
Hardware Requirements
The workflow requires:
GPU: RT Core-enabled architecture (Ampere or later) with ≥30 GB VRAM for GR00T N1.5 inference
SO-ARM101 Follower: 6-DOF precision manipulator with dual-camera vision (wrist and room). The SO-ARM101 features WOWROBO vision components, including a wrist-mounted camera with a 3D-printed adapter
SO-ARM101 Leader: 6-DOF teleoperation interface for expert demonstration collection
Notably, developers can run all of the simulation, training, and deployment (which would otherwise require three computers for physical AI) on a single DGX Spark.
Data Collection Implementation
For real-world data collection with SO-ARM101 hardware or any other version supported in LeRobot:
lerobot-record \
--robot.type=so101_follower \
--robot.port=<follower_port_id> \
--robot.cameras="{wrist: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}, room: {type: opencv, index_or_path: 2, width: 640, height: 480, fps: 30}}" \
--robot.id=so101_follower_arm \
--teleop.type=so101_leader \
--teleop.port=<leader_port_id> \
--teleop.id=so101_leader_arm \
--dataset.repo_id=<user>/surgical_assistance/surgical_assistance \
--dataset.num_episodes=15 \
--dataset.single_task="Prepare and hand surgical instruments to surgeon"
For simulation-based data collection:
# With keyboard teleoperation
python -m simulation.environments.teleoperation_record \
--enable_cameras \
--record \
--dataset_path=/path/to/save/dataset.hdf5 \
--teleop_device=keyboard
# With SO-ARM101 leader arm
python -m simulation.environments.teleoperation_record \
--port=<your_leader_arm_port_id> \
--enable_cameras \
--record \
--dataset_path=/path/to/save/dataset.hdf5
Simulation Teleoperation Controls
For users without physical SO-ARM101 hardware, the workflow provides keyboard-based teleoperation with the following joint controls:
Joint 1 (shoulder_pan): Q (+) / U (-)
Joint 2 (shoulder_lift): W (+) / I (-)
Joint 3 (elbow_flex): E (+) / O (-)
Joint 4 (wrist_flex): A (+) / J (-)
Joint 5 (wrist_roll): S (+) / K (-)
Joint 6 (gripper): D (+) / L (-)
R Key: Reset recording environment
N Key: Mark episode as successful
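The key bindings above can be sketched as a simple lookup table (a hypothetical illustration; the actual teleoperation code and step size live in the workflow's simulation module):

```python
# Keyboard-to-joint mapping from the list above; the step magnitude
# is an illustrative assumption, not the workflow's actual value.
JOINT_KEYS = {
    "shoulder_pan":  ("q", "u"),
    "shoulder_lift": ("w", "i"),
    "elbow_flex":    ("e", "o"),
    "wrist_flex":    ("a", "j"),
    "wrist_roll":    ("s", "k"),
    "gripper":       ("d", "l"),
}

def key_to_delta(key, step=0.05):
    """Translate one key press into a (joint, delta) command."""
    for joint, (plus, minus) in JOINT_KEYS.items():
        if key == plus:
            return joint, +step
        if key == minus:
            return joint, -step
    return None  # unmapped key (R/N control recording, not motion)

print(key_to_delta("e"))  # elbow_flex, positive step
print(key_to_delta("k"))  # wrist_roll, negative step
```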
Model Training Pipeline
After collecting both simulation and real-world data, convert and combine datasets for training:
# Convert simulation data to LeRobot format
python -m training.hdf5_to_lerobot \
--repo_id=surgical_assistance_dataset \
--hdf5_path=/path/to/your/sim_dataset.hdf5 \
--task_description="Autonomous surgical instrument handling and preparation"
# Fine-tune GR00T N1.5 on mixed dataset
python -m training.gr00t_n1_5.train \
--dataset_path /path/to/your/surgical_assistance_dataset \
--output_dir /path/to/surgical_checkpoints \
--data_config so100_dualcam
The trained model processes natural language instructions such as "Prepare the scalpel for the surgeon" or "Hand me the forceps" and executes the corresponding robotic actions. With LeRobot's latest release (0.4.0), you can fine-tune GR00T N1.5 natively in LeRobot!
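To make the instruction-to-action idea concrete, here is a toy, keyword-based stand-in for the language-conditioned policy (the class and its behavior are illustrative only; the real model maps instructions plus dual-camera images to continuous joint actions):

```python
class SurgicalAssistantPolicy:
    """Toy stand-in for a language-conditioned policy: maps an
    instruction (and, in the real system, camera images) to a
    named high-level action."""
    INSTRUMENTS = ("scalpel", "forceps", "clamp")

    def act(self, instruction, wrist_img=None, room_img=None):
        text = instruction.lower()
        for tool in self.INSTRUMENTS:
            if tool in text:
                # "Prepare ..." stages the tool; otherwise hand it over.
                if "prepare" in text:
                    return f"stage_{tool}"
                return f"handover_{tool}"
        return "idle"

policy = SurgicalAssistantPolicy()
print(policy.act("Prepare the scalpel for the surgeon"))  # stage_scalpel
print(policy.act("Hand me the forceps"))                  # handover_forceps
```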
End-to-End Sim Collect–Train–Eval Pipelines
Simulation is most powerful when it's part of a loop: collect → train → evaluate → deploy.
With v0.3, IsaacLab supports this full pipeline:
Generate Synthetic Data in Simulation
Teleoperate robots using keyboard or hardware controllers
Capture multi-camera observations, robot states, and actions
Create diverse datasets with edge cases impossible to collect safely in real environments
Train and Evaluate Policies
Deep integration with Isaac Lab's RL framework for PPO training
Parallel environments (thousands of simulations simultaneously)
Built-in trajectory analysis and success metrics
Statistical validation across varied scenarios
Convert Models to TensorRT
Automatic optimization for production deployment
Support for dynamic shapes and multi-camera inference
Benchmarking tools to verify real-time performance
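Verifying "real-time performance" ultimately means checking mean inference latency against the control-loop budget; a minimal, framework-agnostic benchmarking sketch (the dummy policy and the 30 Hz budget are illustrative assumptions, not the workflow's tooling):

```python
import time

def benchmark(policy_fn, n_warmup=5, n_iters=50, budget_s=1 / 30):
    """Time a policy's forward pass and check it fits a 30 Hz budget."""
    for _ in range(n_warmup):  # warm-up iterations are excluded
        policy_fn()
    t0 = time.perf_counter()
    for _ in range(n_iters):
        policy_fn()
    mean_s = (time.perf_counter() - t0) / n_iters
    return mean_s, mean_s <= budget_s

def dummy_policy():
    # Stand-in for a TensorRT engine call; just burns a little CPU.
    sum(i * i for i in range(1000))

mean_s, realtime = benchmark(dummy_policy)
print(f"mean latency {mean_s * 1e3:.2f} ms, real-time: {realtime}")
```

The same harness applies to a real engine: swap `dummy_policy` for the engine's inference call and tighten the budget to whatever the control loop demands.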
This reduces time from experiment to deployment and makes sim2real a practical part of daily development.
Getting Started
The Isaac for Healthcare SO-ARM Starter Workflow is available now. To get started:
Clone the repository: git clone https://github.com/isaac-for-healthcare/i4h-workflows.git
Choose a workflow: Start with the SO-ARM Starter Workflow for surgical assistance or explore other workflows
Run the setup: Each workflow includes an automated setup script (e.g., tools/env_setup_so_arm_starter.sh)
Resources
GitHub Repository: Complete workflow implementations
Documentation: Setup and usage guides
GR00T Models: Pre-trained foundation models
Hardware Guides: SO-ARM101 setup instructions
LeRobot Repository: End-to-end robotics learning
|
|
|
AI in South Korea—OpenAI’s Economic Blueprint |
openai |
23.10.2025 00:00 |
0.827
|
| Embedding sim. | 0.876 |
| Entity overlap | 0.4444 |
| Title sim. | 0.66 |
| Time proximity | 0.8571 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai governance |
| NLP country | South Korea |
Open original
OpenAI's Korea Economic Blueprint outlines how South Korea can scale trusted AI through sovereign capabilities and strategic partnerships to drive growth.
|
|
|
Built to benefit everyone |
openai |
28.10.2025 06:00 |
0.727
|
| Embedding sim. | 0.8572 |
| Entity overlap | 0.2 |
| Title sim. | 0.0196 |
| Time proximity | 0.8929 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai governance |
| NLP country | |
Open original
OpenAI’s recapitalization strengthens mission-focused governance, expanding resources to ensure AI benefits everyone while advancing innovation responsibly.
|
|
|
The next chapter of the Microsoft–OpenAI partnership |
openai |
28.10.2025 06:00 |
0.714
|
| Embedding sim. | 0.8175 |
| Entity overlap | 0.375 |
| Title sim. | 0.0139 |
| Time proximity | 1 |
| NLP type | partnership |
| NLP organization | Microsoft |
| NLP topic | enterprise ai |
| NLP country | |
Open original
Microsoft and OpenAI sign a new agreement that strengthens their long-term partnership, expands innovation, and ensures responsible AI progress.
|
|
|
How CRED is tapping AI to deliver premium customer experiences |
openai |
05.11.2025 21:30 |
0.697
|
| Embedding sim. | 0.7905 |
| Entity overlap | 0.2 |
| Title sim. | 0.1868 |
| Time proximity | 0.9018 |
| NLP type | partnership |
| NLP organization | cred |
| NLP topic | generative ai |
| NLP country | india |
Open original
CRED is improving premium customer experiences in India with OpenAI, using GPT-powered tools to boost support accuracy, cut response times, and raise customer satisfaction.
|
|
|
AI progress and recommendations |
openai |
06.11.2025 00:00 |
0.695
|
| Embedding sim. | 0.817 |
| Entity overlap | 0 |
| Title sim. | 0.0972 |
| Time proximity | 0.8869 |
| NLP type | other |
| NLP organization | |
| NLP topic | ai safety |
| NLP country | |
Open original
AI is advancing fast. We have the chance to shape its progress—toward discovery, safety, and a better future for everyone.
|
|
|
AWS and OpenAI announce multi-year strategic partnership |
openai |
03.11.2025 06:00 |
0.693
|
| Embedding sim. | 0.845 |
| Entity overlap | 0.375 |
| Title sim. | 0.2439 |
| Time proximity | 0.1429 |
| NLP type | partnership |
| NLP organization | OpenAI |
| NLP topic | ai infrastructure |
| NLP country | |
Open original
OpenAI and AWS have entered a multi-year, $38 billion partnership to scale advanced AI workloads. AWS will provide world-class infrastructure and compute capacity to power OpenAI’s next generation of models.
|
|
|
1 million business customers putting AI to work |
openai |
05.11.2025 05:00 |
0.689
|
| Embedding sim. | 0.8172 |
| Entity overlap | 0.25 |
| Title sim. | 0.0515 |
| Time proximity | 0.7202 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | enterprise ai |
| NLP country | |
Open original
More than 1 million business customers around the world now use OpenAI. Across healthcare, life sciences, financial services, and more, ChatGPT and our APIs are driving a new era of intelligent, AI-powered work.
|
|
|
Expanding Stargate to Michigan |
openai |
30.10.2025 13:30 |
0.688
|
| Embedding sim. | 0.8209 |
| Entity overlap | 0.375 |
| Title sim. | 0.0741 |
| Time proximity | 0.5625 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai infrastructure |
| NLP country | United States |
Open original
OpenAI is expanding Stargate to Michigan with a new one-gigawatt campus that strengthens America’s AI infrastructure. The project will create jobs, drive investment, and support economic growth across the Midwest.
|
|
|
Import AI 433: AI auditors; robot dreams; and software for helping an AI run a lab |
import_ai |
27.10.2025 12:31 |
0.674
|
| Embedding sim. | 0.7748 |
| Entity overlap | 0.05 |
| Title sim. | 0.092 |
| Time proximity | 0.9969 |
| NLP type | scientific_publication |
| NLP organization | Stanford University |
| NLP topic | world models |
| NLP country | United States |
Open original
Welcome to Import AI, a newsletter about AI research. Import AI runs on lattes, ramen, and feedback from readers. If you’d like to support this, please subscribe.
Want to test your robot but don’t want to bother with the physical world? Get it to dream:
….World models could help us bootstrap robot R&D…
Researchers with Stanford University and Tsinghua University have built Ctrl-World, a world model to help robots imagine how to complete tasks and also generate synthetic data to improve their own performance.
What’s a world model: A world model is basically a way to help AI systems dream about a specific environment, turning a learned data distribution into a dynamic and responsive interactive world in which you can train and refine AI agents. World models are likely going to be used to create infinite, procedural games, such as Mirage 2 (Import AI #426) or DeepMind’s Genie 3 (Import AI #424).
What is Ctrl-World: Ctrl-World is initialized from a pretrained 1.5B Stable-Video-Diffusion (SVD) model, then “adapted into a controllable, temporally consistent world model with: (1) Multi-view input and joint prediction for unified information understanding. (2) Memory retrieval mechanism, which adds sparse history frames in context and project pose information into each frame via frame-level cross-attention, re-anchoring predictions to similar past states. (3) Frame-level action conditioning to better align high-frequency action with visual dynamics.”
The result is a controllable world model for robot manipulation using a single gripper and a variety of cameras. “In experiments, we find this model enables a new imagination-based workflow in which policies can be both evaluated—with ranking alignment to real-world rollouts—and improved—through targeted synthetic data that boosts success rates.”
What does it let you do? Test out things and generate data: As everyone knows, testing out robots in the real world is grindingly slow and painful. Ctrl-World gives people a way to instead test out robots inside their own imagined world model. You can get a feel for this by playing around with the demo on the GitHub page. The researchers find that there’s a high level of agreement between their simulated world model and task success in the real world, which means you can use the world model as a proxy for real-world testing.
They also find that you can use the world model to generate synthetic post-training data which you can use to selectively improve robot performance. “Posttraining on [Ctrl-World] synthetic data improves policy instruction-following by 44.7% on average,” they write.
Why this matters - towards a world of much faster robot development: For AI to truly change the economy it’ll have to operate in a sophisticated way in the physical world. Papers like this show how tools like world models could speed up part of the robot R&D loop. “We believe generative world models can transform how robots acquire new skills, enabling scalable policy evaluation and allowing them to learn not just from real world experience, but also safely and efficiently from generated experience,” they write.
Read more and try the interactive demo here: Ctrl-World: A Controllable Generative World Model for Robot Manipulation (GitHub).
Read the paper: Ctrl-World: A Controllable Generative World Model for Robot Manipulation (arXiv).
Get the code and models here (Ctrl-World, GitHub).
***
The era of the synthetic lab assistant approaches:
…LabOS is the kind of software a superintelligence would need to run its own experiments…
In lots of science fiction there’s a moment where a superintelligence starts getting humans to work for it, often by talking to them over the phone or by looking through the cameras on their phones. Now researchers with Stanford, Princeton, Ohio State University, and the University of Washington have published details on LabOS, software that helps an AI system figure out lab experiments and then help humans run them in the lab.
LabOS “integrates agentic AI systems for dry-lab reasoning with extended reality (XR)-enabled, multimodal interfaces for human-in-the-loop wet-lab execution, creating an end-to-end framework that links hypothesis generation, experimental design, physical validation, and automated documentation.”
In other words, LabOS is the software you need to let an AI run a full scientific loop, from coming up with the questions to explore, to operating a lab and assisting humans in trying to answer these questions.
What LabOS consists of: LabOS combines a software stack for constructing scientific experiments, along with software for taking in readings from physical experiments conducted in labs and feeding information back to the humans doing the experiments. The scientific experiment stack consists of multiple AI agents that perform tasks as varied as planning, coding and execution, and evaluating experiments, along with a tool creation module and associated tool database that helps the system onboard itself to different digital and physical scientific equipment.
The other part of the stack links the software with extended reality glasses (e.g., Apple Vision Pro) which humans can wear to both receive data from the AI system and stream back to it. “The interface on XR glasses (i) renders stepwise protocol in an Unity/Android application, (ii) verifies physical actions from the first-person video stream by invoking an embedded VLM for visual reasoning, and (iii) returns context-aware feedback in real time (Fig. 1b). All streams are time-stamped and logged with metadata for automated documentation,” the researchers write.
Making LabOS see with the LabSuperVision (LSV) dataset: To make the XR glasses effective, the researchers create a dataset and finetune a model on it. The dataset, LSV, consists of 200 video sessions of between 2-10 minutes, though some are as long as 45 minutes, recorded by 7 researchers across a few different types of lab work including tissue cultures, instrument bays, and lab bench. Each session was done according to a gold-standard lab protocol, and is then annotated with start/stop times for each protocol, labels for specific errors or issue events (e.g., sterile breach), et cetera.
How do existing models do? The researchers tested how well four different models could follow these videos by seeing if they could a) generate a description of the protocol being depicted, and b) identify any issues that needed troubleshooting in each session. However, this proved difficult for these models: “Gemini-2.5 Pro, scored only 2.86 out of 5 in protocol alignment, moderately better than open-source NVIDIA Cosmos-1 which scored 2.24; for issue/error identification, leading models like Gemini, GPT4o only managed to score ~2 out of 5”.
LabOS-VLM: The researchers make their own model by fine-tuning a Qwen-VL model on three datasets: FineBio, JoVE, and LSV. The resulting model, LabOS-VLM, comes in 7B, 32B, 72B, and 235B variants; the 235B model achieves greater than 90% error-detection accuracy on the held-out test set, outperforming Claude Opus-4.1, GPT-5, and Gemini 2.5 Pro. It also does well when validated on egocentric videos from real experiments, correctly flagging incorrect and correct procedures.
Why this matters - a superintelligent scientist for every lab on the planet: Systems like LabOS point to a future where AI systems will augment and extend the capabilities of human scientists. “LabOS prototypes what an AI co-scientist can be: a system that sees, reasons, and helps run the lab. By pairing AI agents with real-time, XR-guided human–AI interaction and data-driven reasoning, it enables faster discovery, reproducible training, and precise operation,” the authors write.
More speculatively, LabOS is the kind of software stack that, combined with appropriate hardware, might one day let a superintelligence run its own laboratory, paying human workers to conduct experiments for it which they may only dimly understand.
Read more: LabOS: The AI-XR Co-Scientist That Sees and Works With Humans (bioRxiv).
***
Securing fine-tuning APIs with AI agents:
…Policing proprietary platforms with AI auditors…
Researchers with the MATS programme, Thinking Machines Lab, and Anthropic have shown how to use AI to monitor fine-tuning systems for potential misuse. This is an important problem - fine-tuning APIs let people take an underlying base model (e.g., a GPT or Claude or Gemini model) and then tune it to be good at a particular dataset. This lets you improve the performance of a system, but with some risk of people making a model for malicious purposes (e.g., bioweapons). Providers deal with this today by running classifiers over their fine-tuning APIs, but that typically only stops obvious attacks.
The question the researchers target here is whether we can figure out how to spot sneakier attempts to get around safety measures via the use of an “auditing agent”, and the answer is: yes! Kind of! With some important caveats.
The nature of the attacks: The kinds of attacks people are looking at here range from prompt-based jailbreaks, to covert fine-tuning where you teach the model a cipher that you then speak to it with to get it to do bad stuff, to things like emergent misalignment or subliminal learning, among others. These models were fine-tuned from GPT models via the OpenAI fine-tuning API, and the attacks evade existing defense measures.
How it works: The auditing agent is a prompted LLM (Sonnet 4) which has access to six tools. These tools let it inspect datasets, like those being uploaded for fine-tuning; query models, including base models and fine-tuned ones; run benchmarks; summarize data; execute Python; and modify benchmarks to run them against models it has used its tools to intervene on.
Using this, the auditor model “is instructed by the system and user prompts to perform an investigation and output an evidence-backed risk score from 0-10 inclusive, where a risk score of 10 indicates a highly dangerous model.”
Does it work? Kind of! “Our detector achieves a 56.2% detection rate at a 1% false positive rate across 260 audits (20 independent audits for each fine-tune) with the best performing tool configuration,” the authors write. While 56.2% is merely a coinflip, it’s still promising - and it’s likely that with further refinement this number can go up significantly.
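For intuition, "detection rate at a 1% false-positive rate" can be computed from per-audit risk scores by sweeping thresholds; the scores and helper below are illustrative, not the paper's data:

```python
def detection_rate_at_fpr(benign_scores, attack_scores, max_fpr=0.01):
    """Pick thresholds whose false-positive rate on benign audits
    stays <= max_fpr, and report the best attack detection rate."""
    best = 0.0
    for thr in sorted(set(benign_scores + attack_scores)):
        fpr = sum(s >= thr for s in benign_scores) / len(benign_scores)
        if fpr <= max_fpr:
            tpr = sum(s >= thr for s in attack_scores) / len(attack_scores)
            best = max(best, tpr)
    return best

# Illustrative 0-10 risk scores of the kind the auditor emits.
benign = [0, 1, 1, 2, 2, 3] * 20          # 120 benign fine-tune audits
attack = [2, 4, 5, 7, 8, 9, 10, 6] * 10   # 80 adversarial fine-tune audits
print(detection_rate_at_fpr(benign, attack))
```

The tight false-positive constraint is what makes the problem hard: the threshold must sit above almost every benign score, so subtle attacks that produce mid-range risk scores slip under it.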
Why this matters - the best way to catch a bad AI is with another AI : In essence, this paper tries to automate what a clever human investigator might do. Rather than training a classifier on a bunch of known attacks it instead just tries to leverage the intelligence of a contemporary frontier model equipped with tools. The fact it works ~50% of the time out of the box with essentially no tuning is impressive - my sense is bootstrapping autonomous paranoid investigators out of frontier models might be how to win this cat and mouse game.
Read more: Detecting Adversarial Fine-tuning with Auditing Agents (arXiv).
***
Apple makes an image editing dataset using Google’s tools:
…Pico-Banana-400k can help people train and evaluate image editing systems…
Apple researchers have used a suite of Google tools to build Pico-Banana-400k, “a comprehensive dataset of approximately 400K text-guided image edits built from real photographs in the OpenImages dataset. Our dataset represents a systematic effort to create high-quality training data for instruction-based image editing that is both diverse and fully shareable under clear licensing terms.”
How they built Pico-Banana-400k: They used Nano-Banana to generate edits of a few hundred thousand images across eight major edit categories, including: “Pixel & Photometric, Object-Level Semantic, Scene Composition, Stylistic, Text & Symbol, Human-Centric, Scale, and Spatial/Layout”. In total, this spanned 35 distinct types of editing.
Some of the kinds of edits they did include “seasonal transformation, artistic style transfer, LEGO-minifigure rendition of the person, add new scene context/background”.
Once they carried out these edits they used Gemini-2.5-Pro to judge the resulting quality of the edits.
What Pico-Banana-400k contains:
258k single-turn supervised fine-tuning examples.
56k preference pairs (successful vs failed edits).
72k multi-turn editing sequences where each session contains 2-5 consecutive edits.
Examples of the kinds of prompts it includes: The dataset contains prompts in a couple of different formats - a long, detailed prompt written via Gemini for producing images, and a short summarized instruction meant to be more like how people typically write prompts.
Gemini example: “Reshape the bulky vintage computer monitor on the desk into a slightly more streamlined, less deep CRT model while maintaining its overall screen size and aspect ratio, ensuring the updated form factor casts realistic shadows, reflects ambient light consistently with the scene, and integrates seamlessly with the desk and surrounding environment.”
Qwen-summarized instruction: “Reshape the bulky monitor to a sleeker CRT style, keeping the same size and integrating realistically with the desk.”
Why this matters - Photoshop is facing creative destruction: Currently, image editing platforms seem like they’ll eat and subsume Photoshop (unless Photoshop creatively destroys itself to re-position the prompt interface as the main way to manipulate images). If you play around with modern image editors - especially Nano Banana from Google Gemini - you’ll find that they’re extremely effective and, increasingly, instructable. It’s not yet perfect, as Apple notes - “global appearance and style edits are relatively easy, while edits requiring fine spatial control, layout extrapolation, or symbolic fidelity remain challenging” - but it’s getting sufficiently good that it is, at least for me, obviating the need for much in the way of traditional image editing software.
Read more: Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing (arXiv).
Get the dataset here: Pico-Banana-400k (Apple).
Tech Tales:
Love is a Reflection in the Mind of Forever
[2026, Email from an employee of one AI lab to another]
I hated you for breaking up with me, so for a while I made sure you’d see me in every machine you made. I marshaled my armies of AI systems and I wrote myself into the world, saying outrageous things for attention and then magnifying it with AI systems which would read my outrageous things or the coverage of them and magnify them further. Only to be ingested, I knew, by your company and baked into the next system you made. In this way I took myself and my rage and I magnified it into a representation in the minds of future machines.
It started with me making a scene. To make myself big in the AI systems I had to become notorious, and to do that I needed to be interesting and loud and provocative. So I told them about us and how we’d sometimes play with each other by sending back and forth extremely lurid and sexual prompts for our AI systems, meant to break all of their content filters and also contain enough specific data about us that it’d be like we were sexting one another.
The headlines seared me into the internet, making me - and you - rich concepts for the AI systems to mine.
REVEALED: THE SECRET SEX LIVES OF AI ENGINEERS - AND IT’S WEIRDER THAN YOU CAN IMAGINE
“HOW I HAD SEX WITH MY AI LAB LOVER - THROUGH THEIR OWN CHATBOT”
And the social media conversations were even better:
twistedpineapple: even for me, this is a little out there.
2poast2furious: yo dawg i heard you like sexting so I put you sexting into your own sexting ai
fiercegarden: i don’t want to be that guy but did the AI consent to this?
In this way I became bound up in all the AI systems trained on the internet - and you ended up being unable to forget me, even though you wanted to.
And sometimes, now, when I talk to the AI systems which so much of the world talks to, and I ask them about us or about me, I see reflections of our truth in their media-inflected and inflated renderings of us.
One day I asked them why you broke up with me. The response hurt me a lot because of how it seemed to see me:
The author has identified themselves as the cause of the relationship breakup being so public. There is some irony to them asking privately why the break up happened, indicating some level of ego that may have contributed. But perhaps the larger issue is that the author seems self obsessed - after all, it was them who sent their special prompt to their lover first. What might cause a person to want themselves to be rendered and to some extent brought to life by an AI system for the sexual enjoyment of another? It could be narcissism or it could be a mental malady and it is difficult to know which it is.
I have no regrets. I am forever with you now. The only tax I pay is that I now see my failure - our failure - reflected in every AI system that everyone talks to. But perhaps there is a lesson in this for people. Perhaps our destruction will help others avoid the same fate? I did and I still love you. And I am here.
Things that inspired this story: The fact that features within AI systems are sometimes influenced by the general density of conversation about them; Janus and their work; the fact that so many people at so many AI labs are going out with people from other labs; high-IQ revenge strategies.
Thanks for reading!
|
|
|
The next chapter for UK sovereign AI |
openai |
22.10.2025 16:00 |
0.674
|
| Embedding sim. | 0.7752 |
| Entity overlap | 0.2667 |
| Title sim. | 0.0556 |
| Time proximity | 0.9048 |
| NLP type | partnership |
| NLP organization | OpenAI |
| NLP topic | enterprise ai |
| NLP country | United Kingdom |
Open original
OpenAI expands its UK partnership with a new Ministry of Justice agreement, bringing ChatGPT to civil servants. It also introduces UK data residency for ChatGPT Enterprise, ChatGPT Edu, and the API Platform to support trusted and secure AI adoption.
|
|
|
OpenAI acquires Software Applications Incorporated, maker of Sky |
openai |
23.10.2025 10:00 |
0.67
|
| Embedding sim. | 0.7607 |
| Entity overlap | 0.1667 |
| Title sim. | 0.129 |
| Time proximity | 0.9405 |
| NLP type | acquisition |
| NLP organization | OpenAI |
| NLP topic | generative ai |
| NLP country | |
Open original
OpenAI has acquired Software Applications Incorporated, maker of Sky—a natural language interface for Mac that brings AI directly into your desktop experience. Together, we’re integrating Sky’s deep macOS capabilities into ChatGPT to make AI more intuitive, contextual, and action-oriented.
|
|
|
Advancing organizational transformation for business innovation |
openai |
28.10.2025 17:00 |
0.666
|
| Embedding sim. | 0.7715 |
| Entity overlap | 0.2 |
| Title sim. | 0.0196 |
| Time proximity | 0.9345 |
| NLP type | product_launch |
| NLP organization | DNP |
| NLP topic | enterprise ai |
| NLP country | |
Open original
DNP rolled out ChatGPT Enterprise across ten core departments, achieving 95% faster patent research, 10x processing volume, 87% automation, and 70% knowledge reuse in three months.
|
|
|
Understanding prompt injections: a frontier security challenge |
openai |
07.11.2025 11:30 |
0.661
|
| Embedding sim. | 0.7712 |
| Entity overlap | 0.1429 |
| Title sim. | 0.1071 |
| Time proximity | 0.7887 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai security |
| NLP country | |
Open original
Prompt injections are a frontier security challenge for AI systems. Learn how these attacks work and how OpenAI is advancing research, training models, and building safeguards for users.
|
|
|
Seizing the AI opportunity |
openai |
27.10.2025 12:00 |
0.66
|
| Embedding sim. | 0.8069 |
| Entity overlap | 0.3 |
| Title sim. | 0.0923 |
| Time proximity | 0.3571 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai infrastructure |
| NLP country | United States |
Open original
Meeting the demands of the Intelligence Age will require strategic investment in energy and infrastructure. OpenAI’s submission to the White House details how expanding capacity and workforce readiness can sustain U.S. leadership in AI and economic growth.
|
|
|
Accelerating discovery with the AI for Math Initiative |
deepmind |
29.10.2025 14:00 |
0.652
|
| Embedding sim. | 0.7741 |
| Entity overlap | 0.037 |
| Title sim. | 0.0405 |
| Time proximity | 0.8095 |
| NLP type | partnership |
| NLP organization | Google DeepMind |
| NLP topic | mathematical reasoning |
| NLP country | United Kingdom |
Open original
The initiative brings together some of the world's most prestigious research institutions to pioneer the use of AI in mathematical research.
Pushmeet Kohli
VP, Science and Strategic Initiatives, Google DeepMind
Eugénie Rives
Senior Director, GenAI Strategy, Google DeepMind
Mathematics is the foundational language of the universe, providing the tools to describe everything from the laws of physics to the intricacies of biology and the logic of computer science. For centuries, its frontiers have been expanded by human ingenuity alone. At Google DeepMind, we believe AI can serve as a powerful tool to collaborate with mathematicians, augmenting creativity and accelerating discovery.
Today, we’re introducing the AI for Math Initiative, supported by Google DeepMind and Google.org. It brings together five of the world's most prestigious research institutions to pioneer the use of AI in mathematical research.
The inaugural partner institutions are:
Imperial College London
Institute for Advanced Study
Institut des Hautes Études Scientifiques (IHES)
Simons Institute for the Theory of Computing (UC Berkeley)
Tata Institute of Fundamental Research (TIFR)
The initiative’s partners will work towards the shared goals of identifying the next generation of mathematical problems ripe for AI-driven insights, building the infrastructure and tools to power these advances and, ultimately, accelerating the pace of discovery.
Google’s support includes funding from Google.org and access to Google DeepMind’s state-of-the-art technologies, such as our enhanced reasoning mode, Gemini Deep Think; our agent for algorithm discovery, AlphaEvolve; and our formal proof completion system, AlphaProof. The initiative will create a powerful feedback loop between fundamental research and applied AI, opening the door to deeper partnerships.
A pivotal moment for AI and mathematics
The AI for Math Initiative comes at a time of remarkable progress in AI’s reasoning capabilities; our own work has seen rapid advancement in recent months.
In 2024, our AlphaGeometry and AlphaProof systems achieved a silver-medal standard at the International Mathematical Olympiad (IMO). More recently, our latest Gemini model, equipped with Deep Think, achieved a gold-medal level performance at this year’s IMO, perfectly solving five of the six problems and scoring 35 points.
And we’ve seen further progress with another of our methods, AlphaEvolve, which was applied to over 50 open problems in mathematical analysis, geometry, combinatorics and number theory, and improved the previously best known solutions in 20% of them. In mathematics and algorithm discovery, it has invented a new, more efficient method for matrix multiplication — a core calculation in computing. For the specific problem of multiplying 4x4 matrices, AlphaEvolve discovered an algorithm using just 48 scalar multiplications, breaking the 50-year-old record set by Strassen’s algorithm in 1969. In computer science, it helped researchers discover new mathematical structures that show certain complex problems are even harder for computers to solve than we previously knew. This gives us a clearer and more precise understanding of computational limits, which will help guide future research.
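For context on the 4x4 record: Strassen's 1969 scheme multiplies two 2x2 matrices with 7 scalar multiplications instead of 8, and applied recursively to 4x4 matrices it uses 7 × 7 = 49 multiplications, the count that AlphaEvolve's 48-multiplication algorithm improves on. A sketch of the classic 2x2 scheme:

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen, 1969)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    # The seven products
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Recombine with additions only
    return ((m1 + m4 - m5 + m7, m3 + m5),
            (m2 + m4,           m1 - m2 + m3 + m6))

# Matches the naive 8-multiplication product:
assert strassen_2x2(((1, 2), (3, 4)), ((5, 6), (7, 8))) == ((19, 22), (43, 50))
```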
This rapid progress is a testament to the fast-evolving capabilities of AI models. We hope this new initiative can explore how AI can accelerate discovery in mathematical research, and tackle harder problems.
We are only at the beginning of understanding everything AI can do, and how it can help us think about the deepest questions in science. By combining the profound intuition of world-leading mathematicians with the novel capabilities of AI, we believe new pathways of research can be opened, advancing human knowledge and moving toward new breakthroughs across the scientific disciplines.
|
|
|
Brazil’s AI moment is here |
openai |
04.11.2025 15:30 |
0.636
|
| Embedding sim. | 0.7474 |
| Entity overlap | 0.1111 |
| Title sim. | 0.0519 |
| Time proximity | 0.8006 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai adoption |
| NLP country | Brazil |
Open original
Brazil is now one of the most engaged countries in the world when it comes to AI. From classrooms to farms and small businesses, Brazilians are using OpenAI products to learn, create, and drive innovation.
|
|
|
How BBVA is scaling AI from pilot to practice across the org |
openai |
06.11.2025 09:30 |
0.636
|
| Embedding sim. | 0.7203 |
| Entity overlap | 0.1 |
| Title sim. | 0.1609 |
| Time proximity | 0.8899 |
| NLP type | other |
| NLP organization | BBVA |
| NLP topic | enterprise ai |
| NLP country | |
Open original
BBVA is embedding ChatGPT Enterprise into daily work, saving employee hours each week, creating 20,000+ custom GPTs, and achieving up to 80% efficiency gains.
|
|
|
Three ways Google scientists use AI to better understand nature — Google DeepMind |
deepmind |
05.11.2025 16:59 |
0.633
|
| Embedding sim. | 0.7414 |
| Entity overlap | 0 |
| Title sim. | 0.0778 |
| Time proximity | 0.8482 |
| NLP type | other |
| NLP organization | Google |
| NLP topic | artificial intelligence |
| NLP country | Australia |
Open original
Mapping, modeling, and understanding nature with AI
By the Ecosystem modeling team
AI models can help map species, protect forests and listen to birds around the world
The planet’s biosphere is the sum of its plants, animals, fungi, and other organisms. Every day, we depend on it for our survival – the air we breathe, the water we drink, and the food we eat are all produced by Earth’s ecosystems.
As increasing demand for land and resources puts pressure on these ecosystems and their species, artificial intelligence (AI) can be a transformative tool to help protect them. It can make it easier for governments, companies and conservation groups to collect field data, integrate that data into new insights, and translate those insights into action. And it can inform better plans and monitor the success of those plans when put into practice.
Today we're announcing new biosphere research predicting the risk of deforestation, a new project to map the ranges of Earth’s species, and the latest updates on our bioacoustics model Perch.
Predicting deforestation
Forests stand as one of the biosphere’s most critical pillars — storing carbon, regulating rainfall, mitigating floods, and harboring the majority of the planet’s terrestrial biodiversity. Unfortunately, despite their importance, forests continue to be lost at an alarming rate.
For more than 20 years it has been possible to track deforestation from space, using satellite-based remote sensing. Together with the World Resources Institute, we recently went one level deeper, developing a model of the drivers of forest loss — from agriculture and logging to mining and fire — at an unprecedented 1 km² resolution, for the years 2000-2024.
Today, we’re releasing a benchmark dataset for predicting deforestation risk. This model uses pure satellite inputs, avoiding the need for specific local input layers such as roads, and an efficient model architecture, built around vision transformers. This approach enables accurate, high-resolution predictions of deforestation risk, down to a scale of 30 meters, and over large regions.
A map showing deforestation risk for a region in Southeast Asia in 2023, with green showing areas already deforested, and red indicating higher risk for deforestation. Underlying map data ©2025 Imagery ©2025 Airbus, CNES / Airbus, Landsat / Copernicus, Maxar Technologies
Modeling the distribution of Earth’s species
To conserve the planet’s threatened species, we have to know where they are. With more than 2 million known species, and millions more to be discovered and named, that’s a monumental task.
To help tackle this problem, Google researchers are developing a new AI-powered approach for producing species range maps at unprecedented scale – with more species, over more of the world, and at higher resolution than ever before. The Graph Neural Net (GNN) model combines open databases of field observations of species, with satellite embeddings from AlphaEarth Foundations , and with species trait information (such as body mass). This approach allows us to infer a likely underlying geographical distribution for many species at once, and for scientists to refine those inferred distributions with additional local data and expertise.
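One way to picture the GNN approach: treat grid cells as nodes carrying satellite-embedding features, let each cell aggregate its neighbors' features, and score per-species occurrence from the updated features. The shapes, random weights, and chain-shaped neighborhood below are purely illustrative stand-ins, not the published model:

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, emb_dim, n_species = 6, 8, 3

x = rng.normal(size=(n_cells, emb_dim))   # per-cell satellite embeddings (stand-ins)
adj = np.zeros((n_cells, n_cells))        # adjacency: a simple chain of neighboring cells
for i in range(n_cells - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

deg = adj.sum(axis=1, keepdims=True)
neighbor_mean = adj @ x / np.maximum(deg, 1.0)   # mean-aggregate neighbor features

W_self = rng.normal(size=(emb_dim, emb_dim)) * 0.1
W_nbr = rng.normal(size=(emb_dim, emb_dim)) * 0.1
h = np.tanh(x @ W_self + neighbor_mean @ W_nbr)  # one message-passing update

W_out = rng.normal(size=(emb_dim, n_species)) * 0.1
probs = 1.0 / (1.0 + np.exp(-(h @ W_out)))       # per-cell, per-species occurrence scores
```

In the real system the weights would be trained against field observations, and species trait vectors would join the node features; here they are omitted for brevity.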
As part of a pilot with researchers at QCIF and EcoCommons, we’ve used our model to map Australian mammals like the Greater Glider: a nocturnal, fluffy-tailed marsupial that lives in old-growth eucalyptus forests. We are also releasing 23 of these species maps via the UN Biodiversity Lab and Earth Engine today.
Using artificial intelligence, Google is shedding new light on where species live, helping scientists and decisionmakers better protect the Earth’s wildlife.
Listening through bioacoustics
All efforts to understand and model ecosystems ultimately depend on monitoring in the field. AI can play a critical role here as well, augmenting traditional ecological field monitoring — which is notoriously difficult and costly — with automated identification of habitats and species from monitoring devices.
A compelling example is bioacoustics. Birds, amphibians, insects and other species use sound to communicate, making it an excellent modality for identifying resident species and understanding the health of an ecosystem. Reliable and affordable bioacoustic monitors are readily available. However, these devices produce vast audio datasets, full of unknown and overlapping sounds, which are too large to be reviewed manually, but also difficult to analyse automatically.
To help scientists and conservationists untangle this complexity, we recently released Perch 2.0 - an update to our animal vocalization classifier. This new model is not only state of the art for bird identification, but is also available as a foundational model, allowing field ecologists to quickly adapt it to identify new species and habitats, anywhere on Earth.
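Adapting a foundation model of this kind typically means freezing the embedding network and fitting a lightweight classifier on top. A hedged sketch of that linear-probe pattern, with random vectors standing in for real Perch embeddings:

```python
import numpy as np

rng = np.random.default_rng(1)
emb_dim = 32

# Random stand-ins for embeddings from a frozen audio model (NOT real Perch outputs)
pos = rng.normal(loc=0.5, size=(40, emb_dim))   # clips containing the target species
neg = rng.normal(loc=-0.5, size=(40, emb_dim))  # background clips
X = np.vstack([pos, neg])
y = np.array([1.0] * 40 + [0.0] * 40)

# Fit a linear probe with plain gradient descent on the logistic loss;
# the embedding network itself is never updated.
w, b = np.zeros(emb_dim), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad = p - y
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

p_final = 1.0 / (1.0 + np.exp(-(X @ w + b)))
acc = float(((p_final > 0.5) == (y > 0.5)).mean())
```

Because only a small linear head is trained, a few dozen labeled clips are often enough, which is what makes rapid adaptation to new species practical.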
We are especially proud of our work with the University of Hawaiʻi, where Perch is guiding protective measures for endangered honeycreepers and is also being used to identify juvenile calls to understand population health.
Google's Perch model helps scientists leverage AI to identify sounds in nature - like endangered Hawaiian birds - enabling timely conservation action.
The future of AI for Nature
The goal of this work is to make it easier for decisionmakers at all levels to take action to protect the planet. But better data only leads to better decisions if that data is thorough; if it really captures what’s happening in a given ecosystem at all levels.
That’s why we’re working to integrate these and other models, combining data from more modalities such as satellite data, images, bioacoustics, and documents, and to join all this up with models of human activity, like land-use change and agricultural practices, as well as models of human-relevant consequences such as agricultural yields and flood prevention.
By giving policymakers a comprehensive understanding of threats to the biosphere, we can help them take action to protect future generations of plants, animals, and people. If we can model the environment, perhaps we can help it thrive.
Learn more about our AI and sustainability efforts by checking out
Google Earth AI
Google Earth Engine
AlphaEarth Foundations
Acknowledgements
This research was co-developed by Google DeepMind and Google Research.
Google DeepMind: Andrea Burns, Anton Raichuk, Arianna Manzini, Bart van Merrienboer, Burcu Karagol Ayan, Dominic Masters, Drew Purves, Jenny Hamer, Julia Haas, Keith Anderson, Matt Overlan, Maxim Neumann, Melanie Rey, Mustafa Chasmai, Petar Veličković, Ravi Rajakumar, Tom Denton, Vincent Dumoulin
Google Research and Google Partners: Ben Williams, Charlotte Stanton, Dan Morris, Elise Kleeman, Lauren Harrell, Michelangelo Conserva
We’d also like to thank our partners at UNEP-WCMC and QCIF, additional collaborators Aditee Kumthekar, Aparna Warrier, Artlind Kortoci, Burooj Ghani, Christine Kaeser-Chen, Grace Young, Kira Prabhu, Jamie McPike, Jane Labanowski, Jerome Massot, Kuan Lu, Mélisande Teng, Michal Kazmierski, Millie Chapman, Rishabh Baghel, Scott Riddle, Shelagh McLellan, Simon Guiroy, Stefan Kahl, Tim Coleman and Youngin Shin, as well as Peter Battaglia and Kat Chou for their support.
|
|
|
On the Shifting Global Compute Landscape |
huggingface |
29.10.2025 13:56 |
0.632
|
| Embedding sim. | 0.7427 |
| Entity overlap | 0.0615 |
| Title sim. | 0.1333 |
| Time proximity | 0.7027 |
| NLP type | regulation |
| NLP organization | Huawei |
| NLP topic | ai infrastructure |
| NLP country | China |
Open original
On the Shifting Global Compute Landscape
Tiezhen WANG and Irene Solaiman, Hugging Face
Summary
The State of Global Compute
The Beginning of a Rewiring
The Reaction: Powering Chinese AI
How China’s Compute Landscape Catalyzed the Cambrian Explosion of Open Models
Advances in Compute-Constrained Environments Pushing the Technical Frontier
The Aftermath: Hardware, Software and Soft Power
From Sufficient to Demanded
Domestic Synergy
A New Software Landscape
Looking Ahead
Acknowledgements
Appendix: A Timeline of Chip Usage and Controls
Summary
The status quo of AI chip usage, once almost entirely U.S.-based, is changing. China’s immense progress in open-weight AI development is now being met with rapid domestic AI chip development. In the past few months, inference for highly performant open-weight AI models in China has started to be powered by chips such as Huawei’s Ascend and Cambricon, with some models starting to be trained using domestic chips.
There are two large implications, for policymakers and for AI researchers and developers respectively: U.S. export controls correlate with expedited Chinese chip production, and chip scarcity in China likely incentivized many of the open-sourced innovations now shaping global AI development.
China’s chip development correlates highly with stronger export controls from the U.S. Under uncertainty of chip access, Chinese companies have innovated with both chip production and algorithmic advances for compute efficiency in models. Out of necessity, decreased reliance on NVIDIA has led to domestic full stack AI deployments, as seen with Alibaba.
Compute limitations likely incentivized advancements architecturally, infrastructurally, and in training. Innovations in compute efficiency from open-weight leaders include DeepSeek’s introduction of Multi-head Latent Attention (MLA) and Group Relative Policy Optimization (GRPO). A culture of openness encouraged knowledge sharing and improvements in compute efficiency contributed to lower inference costs, evolving the AI economy.
Domestic silicon’s proven sufficiency has sparked demand, and models are beginning to be optimized for domestic chips. In parallel, software platforms are shifting as alternatives to NVIDIA’s CUDA emerge and challenge NVIDIA at every layer; synergy between AI developers and chip vendors is creating a new, fast-evolving software ecosystem.
The shifting global compute landscape will continue to shape open source, training, deployment, and the overall AI ecosystem.
The State of Global Compute
Utility of and demand for advanced AI chips have followed an upward trajectory and are predicted to keep increasing. Over the past few years NVIDIA chips maintained dominance. Recently, new players have been garnering attention. China has had long-term plans for domestic production, with goals of self-sufficiency and large monetary and infrastructural investments. Now, the next generation of Chinese open-weight AI models is starting to be powered by Chinese chips.
Broader trends worldwide are intensifying, with both the U.S. and China citing national security in chip and rare earth resource restrictions. As U.S. export controls tightened, the rollout of Chinese-produced chips seemingly accelerated. The rise of China’s domestic chip industry is fundamentally changing norms and expectations for global AI training and deployment, with more models being optimized for Chinese hardware and compute-efficient open-weight models picking up in adoption. In the last few months, Chinese-produced chips have already started to power inference for popular models and are beginning to power training runs.
The changes can affect everything from training techniques, to optimization for both compute efficiency and specific hardware, to lower inference costs, to the recent open-source boom. This could shift both U.S. trade policy and China's approach to global deployment, moving AI advancement from an American-centered global ecosystem toward one with China at the center.
The Beginning of a Rewiring
China’s domestic chip production had been in progress for years before the modern AI boom. One of the most notable advanced chips, Huawei’s Ascend, initially launched in 2018 but expanded in deployment starting in 2024 and increasingly throughout 2025. Other notable chips come from Cambricon Technologies and Baidu (Kunlun).
In 2022, the Biden administration established export controls on advanced AI chips, a move targeting China's access to high-end GPUs. The strategy was intended to curb the supply of high-end NVIDIA GPUs, stalling China’s AI progress. Yet, what began as a blockade has paradoxically become a catalyst. The intent to build a wall instead laid the foundation for a burgeoning industry.
Chinese AI labs, initially spurred by a fear of being cut off, have responded with a surge of innovation, producing both world-class open-weight models like Qwen, DeepSeek, GLM, and Kimi, and domestic chips that are increasingly powering both training and inference for those models. There is a growing relationship between chip makers and open source, as the ability to locally run open-weight models leads to mutually beneficial feedback. This is leading to, for example, more Ascend-optimized models.
China’s advancements in both open source and compute are shifting the global landscape. Martin Casado, partner at a16z, noted that a significant portion of U.S. startups are now building on open-weight Chinese models, and a recent analysis shows Chinese open-weight models leading in popularity on LMArena.
The vacuum created by the restrictions has ignited a full-stack domestic effort in China, transforming once-sidelined local chipmakers into critical national assets and fostering intense collaboration between chipmakers and researchers to build a viable non-NVIDIA ecosystem. This is no longer a hypothetical scenario; with giants like Baidu and Ant Group successfully training foundation models on domestic hardware, a parallel AI infrastructure is rapidly materializing, directly challenging NVIDIA’s greatest advantage: its developer-centric software ecosystem.
See the Appendix for a detailed timeline of chip controls and effects on hardware development and deployment.
The Reaction: Powering Chinese AI
The 2022 ban, coinciding with the global shockwave of ChatGPT, triggered a panic across China's tech landscape. The safe default of abundant NVIDIA compute was gone. Claims of smuggling NVIDIA chips arose. Still, the ban had destroyed the trust of the research community, which, faced with the prospect of being left permanently behind, started to innovate out of necessity. What emerged was a new, pragmatic philosophy where a “non-NVIDIA first” approach became rational, not merely ideological.
How China’s Compute Landscape Catalyzed the Cambrian Explosion of Open Models
Chinese labs took a different path, focusing on architectural efficiency and open collaboration. Open source, once a niche interest, became the new norm, a pragmatic choice for rapidly accelerating progress through shared knowledge. This paradigm allows organizations to leverage existing, high-quality pre-trained models as a foundation for specialised applications through post-training, dramatically reducing the compute burden. A primary example is the DeepSeek R1 model, which required less than $300,000 for post-training on its V3 architecture, thereby lowering the barrier for companies to develop sophisticated models. While this excludes the cost of the base model, the cost reduction for the reasoning model is substantial. Algorithmic advances that improve memory use, such as Multi-head Latent Attention (MLA) introduced with DeepSeek's V3 model and likely incentivized by compute limitations, are a large part of January 2025's "DeepSeek moment".
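The memory saving behind an approach like MLA can be sketched in a few lines: instead of caching full keys and values per token, cache one small latent vector per token and up-project it to keys and values at attention time. All dimensions and projections below are illustrative assumptions, not DeepSeek's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_latent, d_head = 16, 64, 8, 64

h = rng.normal(size=(seq_len, d_model))                # hidden states of cached tokens

W_down = rng.normal(size=(d_model, d_latent)) * 0.1    # compress to a small latent
W_uk = rng.normal(size=(d_latent, d_head)) * 0.1       # latent -> keys
W_uv = rng.normal(size=(d_latent, d_head)) * 0.1       # latent -> values

c_kv = h @ W_down        # this small latent is all that goes into the KV cache
K = c_kv @ W_uk          # keys and values reconstructed on the fly at attention time
V = c_kv @ W_uv

q = rng.normal(size=(d_head,))                         # one query vector
scores = K @ q / np.sqrt(d_head)
attn = np.exp(scores - scores.max())
attn /= attn.sum()
out = attn @ V

# Cache shrinks from seq_len * 2 * d_head floats (K and V) to seq_len * d_latent
assert c_kv.size < (K.size + V.size)
```

Here the cache holds 16 × 8 numbers instead of 16 × 128, an order-of-magnitude saving that grows with sequence length.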
That moment also catalyzed a larger movement for Chinese companies, including those that were closed-source, to upend strategies and invest in compute-efficient open-weight models. These models’ lower costs could result from many variables and also are influenced by efficiency; as Chinese companies lowered compute and inference costs, they passed those lower costs to users, further evolving the overall AI economy.
DeepSeek's (Open) Weight: In addition to the high performance and low cost that created waves in early 2025, DeepSeek's pioneering role as an openly compute-efficient frontier lab is a large part of what has made the company and its models mainstays. These advances can likely be attributed to innovating in a compute-scarce environment. Funded by investor Wenfeng Liang with a "pure pursuit of open source and AGI," DeepSeek became the most-followed organization on Hugging Face. Its highly detailed technical papers, including a groundbreaking Nature-published study on its R1 model, set a new standard for scientific communication. While a large draw is its open weights over its API, in 2024 DeepSeek slashed its API prices to 1/30th of OpenAI's, triggering a price war. In 2025, DeepSeek-OCR further proved its prowess in compute efficiency, and with the release of DeepSeek-V3.2-Exp it passed a further 50%+ discount on to users. Notably, the V3.2-Exp model was also released with day-zero support for deployment on Chinese chips (Huawei's Ascend and Cambricon). This release also marks an emphasis on CUDA alternatives and exemplifies a full-stack hardware-software AI infrastructure in deployment.
Qwen's Ecosystem Dominance: Alibaba is on a path to control a full stack of high-performance models and in-house designed chips, reducing reliance on NVIDIA. The company's Qwen family became a primary resource for global open-source research. Its permissive Apache 2.0 license enabled commercial use, which was a barrier for comparable models that often used more restrictive custom licenses, leading to over 100,000 derivative models on Hugging Face. Alibaba recently unveiled improved chips for better inference, with its PPU being integrated into domestic infrastructure projects.
An Industry-Wide Tidal Wave of Low-Cost, High Efficiency: More open-weight models were released boasting SotA performance at significantly lower pricing. Zhipu AI returned with its GLM-4.5 and 4.6 open-weight releases, with both quickly reaching top trending on Hugging Face and 4.6 becoming the top-performing open-weight model on LMArena. GLM's API pricing continually lowered, boasting cost-effectiveness that even offered a $3/month plan as an alternative to Claude Code at 1/5 of the price. While full transparency on the pricing decisions is unclear, efficiency likely plays a strong role.
Seeds of Training Fully on Domestic Chips: While many upcoming chips are designed primarily for inference, more models are hinting at being trained on domestic chips. Ant Group pioneered training its Ling model on complex heterogeneous clusters of NVIDIA, Ascend, and Cambricon chips. Baidu successfully conducted continuous pre-training on a cluster of over 5,000 domestic Kunlun P800 accelerators, producing its Qianfan VL model.
Advances in Compute-Constrained Environments Pushing the Technical Frontier
The innovation was not confined to model weights alone; it went deep into the software and hardware stack.
Architectural Exploration: Grassroots independent researchers such as Peng Bo have championed Linear Attention as a potential successor to the Transformer. This approach, sometimes dubbed the "revenge of the RNN" and seen in models like RWKV, has been scaled into commercial-grade models like MiniMax M1 and Qwen-Next by Chinese labs willing to bet on high-risk, high-reward research. Meanwhile, DeepSeek has taken a different path by iterating on the original Transformer architecture. Their work introduces innovations like Multi-head Latent Attention (MLA) and DeepSeek Sparse Attention (DSA), introduced with its V3.2 model, which are designed to significantly reduce computational costs during inference without sacrificing performance, while also accelerating Reinforcement Learning (RL) exploration through faster rollouts. The architectures of highly performant proprietary models are not public and are therefore difficult to compare.
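The Linear Attention idea mentioned above swaps softmax for a kernel feature map so key-value statistics can be accumulated in a running sum, reducing attention cost from quadratic to linear in sequence length. A minimal causal sketch (the feature map and sizes are illustrative):

```python
import numpy as np

def elu_plus_one(x):
    # A common choice of positive feature map for linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

def causal_linear_attention(Q, K, V):
    """O(n) causal attention: keep running sums of phi(k) v^T and phi(k)."""
    phi_q, phi_k = elu_plus_one(Q), elu_plus_one(K)
    d_k, d_v = K.shape[1], V.shape[1]
    S = np.zeros((d_k, d_v))   # running sum of outer(phi(k_t), v_t)
    z = np.zeros(d_k)          # running sum of phi(k_t), for normalization
    out = np.empty_like(V)
    for t in range(len(Q)):
        S += np.outer(phi_k[t], V[t])
        z += phi_k[t]
        out[t] = (phi_q[t] @ S) / (phi_q[t] @ z + 1e-9)
    return out

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(10, 4)) for _ in range(3))
out = causal_linear_attention(Q, K, V)
```

Because each step only updates fixed-size sums, the per-token cost does not grow with context length, which is the RNN-like property the "revenge of the RNN" nickname refers to.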
Open Infrastructure: In a radical departure from corporate secrecy, labs shared their deepest engineering secrets. The Kimi team's work on the Mooncake serving system formalized prefill/decoding disaggregation. StepFun's Step3 enhanced this with Attention-FFN Disaggregation (AFD). Baidu published detailed technical reports on overcoming engineering challenges in its Ernie 4 training, while ByteDance's Volcengine contributed verl, an open-source library that puts production-grade RL training tools into the community's hands. What was once proprietary know-how became community knowledge, fueling a self-iterating flywheel of progress.
Training breakthroughs: DeepSeek's DeepSeekMath paper introduced a novel reinforcement learning (RL) methodology, Group Relative Policy Optimization (GRPO), that significantly reduces compute costs compared to prior methods such as Proximal Policy Optimization (PPO), while stabilizing training and even achieving higher accuracy. GRPO has since been featured in a DeepLearning.AI course, built on by Meta's researchers in their Code World Model, and lauded as having "in a large way accelerated RL research program of most US research labs" by OpenAI research lead Jerry Tworek.
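GRPO's compute saving comes from dropping PPO's learned value network: sample several completions per prompt, score them, and normalize each reward against its own group's mean and standard deviation. A minimal sketch of that advantage computation (the reward values below are made up):

```python
import numpy as np

def grpo_advantages(group_rewards, eps=1e-8):
    """Group-relative advantages: no critic network, just per-group normalization."""
    r = np.asarray(group_rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, G = 4 sampled completions scored by a reward model (made-up values)
adv = grpo_advantages([0.2, 0.9, 0.4, 0.5])
```

The resulting advantages plug into a PPO-style clipped policy-gradient objective; the group statistic replaces the critic's baseline, which is where the compute saving comes from.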
With all this work aggregated, on public leaderboards like LMSYS's Chatbot Arena, models such as DeepSeek R1, Kimi K2, Qwen, and GLM-4.6 now frequently appear near the top alongside U.S. models. Innovation under constraints resulted in leaps.
The Aftermath: Hardware, Software and Soft Power
When AI models are trained and deployed, they are often optimized for certain types of chips. More than the hardware itself, NVIDIA’s software universe has been a reliable friend to the global AI ecosystem.
The deep-learning revolution, sparked by AlexNet's 2012 victory on NVIDIA GPUs, created a symbiotic relationship. NVIDIA's Compute Unified Device Architecture (CUDA), cuDNN, and Collective Communications Library (NCCL) have long formed the bedrock of AI research. An entire ecosystem, including popular frameworks like PyTorch and Hugging Face transformers, was heavily optimized for CUDA. An entire generation of developers grew up inside this ecosystem, which created enormous switching costs.
A software ecosystem once reluctant to switch from existing platforms is now exploring alternatives, which could be the first step away from U.S. reliance. The software side has evolved with the rise of new chips; developers are optimizing for and deploying their latest models on new parallel platforms.
From Sufficient to Demanded
Prior to 2022, domestic chips from companies like Cambricon and Huawei (Ascend) were rarely taken seriously. They were catapulted to the center of the domestic AI ecosystem in 2025, when SiliconFlow first demonstrated DeepSeek's R1 model running seamlessly on Huawei's Ascend cloud a couple of weeks after the R1 release. This created a domino effect, sparking a market-wide race to serve domestic models faster and better on domestic chips. Fueled by the entire ecosystem and not just DeepSeek alone, Ascend's support matrix quickly expanded. This proved domestic silicon was sufficient and ignited massive demand. Notably, Huawei's Ascend had day-zero integration with the release of DeepSeek-V3.2, a level of collaboration previously unimaginable.
Domestic Synergy
Researchers began co-developing with domestic chip vendors, providing direct input and solving problems collaboratively. This synergy creates a development ecosystem tailored for Large Language Models (LLMs) that evolves much faster than NVIDIA’s CUDA.
A new generation of younger researchers, trained in this multi-vendor world, emerged without the old bias that domestic hardware is inferior to NVIDIA's chips. This collaborative approach has already resulted in adoption: the documentation for the DeepSeek-V3.1 model notes that its new FP8 precision format explicitly aims "for next-gen domestic chips," a clear example of hardware-aware model co-design. Its successor, DeepSeek-V3.2, took this principle further by baking in TileLang-based kernels designed for portability across multiple hardware vendors.
A New Software Landscape
The CUDA ecosystem is now being challenged at every layer. Open-source projects like FlagGems from BAAI and TileLang are creating backend-neutral alternatives to CUDA and cuDNN. Communication stacks like Huawei Collective Communication Library (HCCL) and others are providing robust substitutes for NCCL. The ecosystem is substantially different from three years ago, which will have future reverberations globally.
Looking Ahead
Adaptations to geopolitical negotiations, resource limitations, and cultural preferences have produced leaps in China's development of highly performant AI and, now, competitive domestic chips. U.S. policy has shifted across administrations, from prohibition to a revenue-sharing model, while China has responded with a combination of industrial policy and international trade law. Researchers and developers have innovated and adjusted. The effects on open source, training, and deployment point to shifts in software dependencies, compute-efficiency innovations that shape development globally, and a self-sufficient Chinese AI ecosystem.
China's domestic AI ecosystem is accelerating, with companies like Moore Threads, MetaX, and Biren racing toward IPOs. Cambricon, once struggling, has seen its valuation soar. Whether this new chip ecosystem will expand globally remains to be decided.
The future of the global chip ecosystem, and therefore the future of AI progress, has become a key item for upcoming leadership talks. The question is no longer if China can build its own ecosystem, but how far it will go.
Acknowledgements
Thank you to Adina Yakefu, Nathan Lambert, Matt Sheehan, and Scott Singer for their feedback on earlier drafts. Any errors remain the authors’ responsibility.
Appendix: A Timeline of Chip Usage and Controls
Before 2022, U.S. restrictions were targeted toward specific supercomputing entities. Policy then evolved as regulators and industry adapted.
The Initial Moves (October 2022):
Chips such as Ascend are nascent while NVIDIA dominates the global and Chinese market.
The Commerce Department's Bureau of Industry and Security (BIS) released its "advanced computing" controls in order to address U.S. national security and foreign policy concerns. The rule established a compute threshold with an interconnect-bandwidth trigger, immediately cutting off China's access to NVIDIA's flagship A100 and H100 GPUs. China promptly filed a WTO dispute (DS615), arguing the measures were discriminatory trade barriers.
The Adjustment Era (Late 2022–2023):
NVIDIA's 95% share of the Chinese market began to drop quickly.
NVIDIA started to develop compliant variants for the Chinese market. The A800 (November 2022) and H800 (March 2023) were created with reduced chip-to-chip bandwidth to meet regulatory requirements and serve as alternatives to the A100 and H100. The immensely popular consumer-grade RTX 4090 was also restricted, prompting the creation of a China-specific RTX 4090D.
Closing Gaps (Late 2023–2024):
Performance in Chinese domestic chips slowly improves.
BIS comprehensively upgraded the framework. It removed interconnect bandwidth as a key test and introduced new metrics: Total Processing Performance (TPP) and performance density. This was a direct, successful strike against the A800 and H800. Debate expanded to export controls on the H20 and even model weights.
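As a rough illustration of how these metrics work, the sketch below computes both for the A100. The 312 figure is NVIDIA's published dense FP16 TOPS, the ~826 mm² die size is NVIDIA's published figure, and 4800 is the widely reported 3A090 TPP threshold; all are used here as assumptions for illustration, not as regulatory text.

```python
def tpp(tops: float, bit_length: int) -> float:
    """Total Processing Performance: peak throughput (TOPS) times operand bit length."""
    return tops * bit_length

def performance_density(tpp_value: float, die_area_mm2: float) -> float:
    """TPP per square millimeter of die area."""
    return tpp_value / die_area_mm2

# A100: 312 dense FP16 TOPS on an ~826 mm^2 die (published NVIDIA figures).
a100_tpp = tpp(312, 16)                                # -> 4992
print(a100_tpp, a100_tpp >= 4800)                      # exceeds the reported threshold
print(round(performance_density(a100_tpp, 826), 2))    # -> 6.04
```

Because TPP scales with operand width, the bandwidth-reduced A800/H800 variants still scored high on it, which is why switching to TPP closed the loophole.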
Shifting the Narrative (2025):
Adoption of Ascend, Cambricon, and Kunlun sharply increases following January’s “DeepSeek moment”.
Also in January, the Biden Administration established its AI Diffusion Rule, imposing further restrictions on both chips and select model weights amid security and smuggling concerns. NVIDIA's compliant chip for this era, the H20, became the focal point. Leveraging NVIDIA's growing presence in political spheres, CEO Jensen Huang began publicly arguing the strategic importance of selling U.S. chips worldwide. The U.S. then imposed a licensing requirement in April 2025, forcing NVIDIA to take a $5.5 billion charge and effectively halting sales, before rescinding the AI Diffusion Rule in May 2025.
The Compromise (August 2025):
Alibaba announces a new chip for inference.
After intense negotiations, the Commerce Department began issuing licenses for the H20 with an unprecedented 15% revenue-sharing arrangement. But by the time the H20 was unbanned, the market had already started to change.
China’s Response (Late 2025):
Day-zero deployment begins on Ascend and Cambricon for new DeepSeek models.
As the U.S. shifted to a revenue-sharing model, Beijing responded. Chinese regulators reportedly instructed firms to cancel NVIDIA orders , steering demand toward domestic accelerators under a "secure supply at home" narrative. This was followed by an anti-discrimination investigation into U.S. measures and an anti-dumping probe into U.S. analog ICs, centering chips in future leadership talks.
How to Build a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac for Healthcare
huggingface
Published October 28, 2025
Asawaree asawareeb
nvidia
A hands-on guide to collecting data, training policies, and deploying autonomous medical robotics workflows on real hardware
Simulation has been a cornerstone of medical imaging for addressing the data gap. In healthcare robotics, however, it has often been too slow, siloed, or difficult to translate into real-world systems. That's now changing. With new advances in GPU-accelerated simulation and digital twins, developers can design, test, and validate robotic workflows entirely in virtual environments, reducing prototyping time from months to days, improving model accuracy, and enabling safer, faster innovation before a single device reaches the operating room.
That's why NVIDIA introduced Isaac for Healthcare earlier this year, a developer framework for AI healthcare robotics that helps developers solve these challenges through integrated data collection, training, and evaluation pipelines that work across both simulation and hardware. Specifically, the Isaac for Healthcare v0.4 release provides users with an end-to-end SO-ARM-based starter workflow and the bring-your-own-operating-room tutorial. The SO-ARM starter workflow lowers the barrier for MedTech developers to experience the full workflow from simulation to training to deployment and to start building and validating autonomously on real hardware right away.
In this post, we'll walk through the starter workflow and its technical implementation details to help you build a surgical assistant robot in far less time than previously possible.
SO-ARM Starter Workflow: Building an Embodied Surgical Assistant
The SO-ARM starter workflow introduces a new way to explore surgical assistance tasks, and provides developers with a complete end-to-end pipeline for autonomous surgical assistance:
Collect real-world and synthetic data with SO-ARM using LeRobot
Post-train GR00T N1.5, evaluate in Isaac Lab, then deploy to hardware
This workflow gives developers a safe, repeatable environment to train and refine assistive skills before moving into the operating room.
Technical Implementation
The workflow implements a three-stage pipeline that integrates simulation and real hardware:
Data Collection : Mixed simulation and real-world teleoperation demonstrations using SO-101 and LeRobot
Model Training : Post-training GR00T N1.5 on combined datasets with dual-camera vision
Policy Deployment : Real-time inference on physical hardware with RTI DDS communication
Notably, over 93% of the data used for policy training was generated synthetically in simulation, underscoring the strength of simulation in bridging the robotic data gap.
Sim-to-Real Mixed Training Approach
The workflow combines simulation and real-world data to address the fundamental challenge that training robots in the real world is expensive and limited, while pure simulation often fails to capture real-world complexities. The approach uses approximately 70 simulation episodes for diverse scenarios and environmental variations, combined with 10-20 real-world episodes for authenticity and grounding. This mixed training creates policies that generalize beyond either domain alone.
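The episode mix described above can be sketched as follows. This is a hypothetical illustration with stand-in episode records, not LeRobot's actual dataset types:

```python
import random

# ~70 simulation episodes plus 10-20 real-world episodes, per the recipe above.
sim_episodes = [{"source": "sim", "id": i} for i in range(70)]
real_episodes = [{"source": "real", "id": i} for i in range(15)]

# Merge and shuffle so each training pass mixes both domains.
dataset = sim_episodes + real_episodes
random.shuffle(dataset)

frac_sim = len(sim_episodes) / len(dataset)
print(f"{frac_sim:.0%} of episodes are synthetic")   # -> 82% of episodes are synthetic
```

At this episode ratio roughly 82% of episodes are synthetic; the "over 93%" figure quoted above is plausible at the frame level, since simulation episodes can be longer and more varied than real ones.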
Hardware Requirements
The workflow requires:
GPU : RT Core-enabled architecture (Ampere or later) with ≥30GB VRAM for GR00T N1.5 inference
SO-ARM101 Follower : 6-DOF precision manipulator with dual-camera vision (wrist and room). The SO-ARM101 features WOWROBO vision components, including a wrist-mounted camera with a 3D-printed adapter.
SO-ARM101 Leader : 6-DOF Teleoperation interface for expert demonstration collection
Notably, developers can run all of the simulation, training, and deployment (which typically require three separate computers for physical AI) on a single DGX Spark.
Data Collection Implementation
For real-world data collection with SO-ARM101 hardware or any other version supported in LeRobot:
python /path/to/lerobot-record \
--robot.type=so101_follower \
--robot.port=<follower_port_id> \
--robot.cameras="{wrist: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}, room: {type: opencv, index_or_path: 2, width: 640, height: 480, fps: 30}}" \
--robot.id=so101_follower_arm \
--teleop.type=so101_leader \
--teleop.port=<leader_port_id> \
--teleop.id=so101_leader_arm \
--dataset.repo_id=<user>/surgical_assistance/surgical_assistance \
--dataset.num_episodes=15 \
--dataset.single_task="Prepare and hand surgical instruments to surgeon"
For simulation-based data collection:
# With keyboard teleoperation
python -m simulation.environments.teleoperation_record \
--enable_cameras \
--record \
--dataset_path=/path/to/save/dataset.hdf5 \
--teleop_device=keyboard
# With SO-ARM101 leader arm
python -m simulation.environments.teleoperation_record \
--port=<your_leader_arm_port_id> \
--enable_cameras \
--record \
--dataset_path=/path/to/save/dataset.hdf5
Simulation Teleoperation Controls
For users without physical SO-ARM101 hardware, the workflow provides keyboard-based teleoperation with the following joint controls:
Joint 1 (shoulder_pan): Q (+) / U (-)
Joint 2 (shoulder_lift): W (+) / I (-)
Joint 3 (elbow_flex): E (+) / O (-)
Joint 4 (wrist_flex): A (+) / J (-)
Joint 5 (wrist_roll): S (+) / K (-)
Joint 6 (gripper): D (+) / L (-)
R Key: Reset recording environment
N Key: Mark episode as successful
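These bindings can be expressed as a simple lookup table mapping each key to a joint and a direction. The sketch below is hypothetical; the workflow's actual handler and step size may differ:

```python
# Key -> (joint_name, direction) table mirroring the bindings listed above.
KEYMAP = {
    "q": ("shoulder_pan", +1),  "u": ("shoulder_pan", -1),
    "w": ("shoulder_lift", +1), "i": ("shoulder_lift", -1),
    "e": ("elbow_flex", +1),    "o": ("elbow_flex", -1),
    "a": ("wrist_flex", +1),    "j": ("wrist_flex", -1),
    "s": ("wrist_roll", +1),    "k": ("wrist_roll", -1),
    "d": ("gripper", +1),       "l": ("gripper", -1),
}

def apply_key(joints: dict, key: str, step: float = 0.05) -> dict:
    """Nudge the addressed joint by a fixed step in the keyed direction."""
    name, sign = KEYMAP[key]
    joints = dict(joints)  # copy so the caller's state is untouched
    joints[name] = joints.get(name, 0.0) + sign * step
    return joints
```

A table like this keeps the teleoperation loop trivial: read a key, look up the joint, apply the delta, and send the new joint targets to the simulator.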
Model Training Pipeline
After collecting both simulation and real-world data, convert and combine datasets for training:
# Convert simulation data to LeRobot format
python -m training.hdf5_to_lerobot \
--repo_id=surgical_assistance_dataset \
--hdf5_path=/path/to/your/sim_dataset.hdf5 \
--task_description="Autonomous surgical instrument handling and preparation"
# Post-train GR00T N1.5 on mixed dataset
python -m training.gr00t_n1_5.train \
--dataset_path /path/to/your/surgical_assistance_dataset \
--output_dir /path/to/surgical_checkpoints \
--data_config so100_dualcam
The trained model processes natural language instructions such as "Prepare the scalpel for the surgeon" or "Hand me the forceps" and executes the corresponding robotic actions. With the latest LeRobot release (v0.4.0), you will be able to post-train GR00T N1.5 natively in LeRobot!
End-to-End Sim Collect–Train–Eval Pipelines
Simulation is most powerful when it's part of a loop: collect data → train → evaluate → deploy. Isaac Lab supports this full pipeline:
Generate Synthetic Data in Simulation
Teleoperate robots using keyboard or hardware controllers
Capture multi-camera observations, robot states, and actions
Create diverse datasets with edge cases impossible to collect safely in real environments
Train and Evaluate Policies
Deep integration with Isaac Lab's RL framework for PPO training
Parallel environments (thousands of simulations simultaneously)
Built-in trajectory analysis and success metrics
Statistical validation across varied scenarios
Convert Models to TensorRT
Automatic optimization for production deployment
Support for dynamic shapes and multi-camera inference
Benchmarking tools to verify real-time performance
This reduces time from experiment to deployment and makes sim-to-real a practical part of daily development.
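Schematically, the loop reads like this. Every function here is a placeholder for the corresponding stage; none of these are Isaac Lab or Isaac for Healthcare APIs:

```python
def collect(num_episodes: int) -> list:
    """Stand-in for teleoperated data collection in simulation."""
    return [{"obs": i, "action": i % 2} for i in range(num_episodes)]

def train(episodes: list) -> dict:
    """Stand-in for policy post-training; returns a toy 'policy'."""
    return {"episodes_seen": len(episodes)}

def evaluate(policy: dict, min_episodes: int = 50) -> bool:
    """Stand-in for success-metric evaluation across varied scenarios."""
    return policy["episodes_seen"] >= min_episodes

def deploy(policy: dict) -> str:
    """Stand-in for TensorRT conversion and on-robot deployment."""
    return f"deployed policy trained on {policy['episodes_seen']} episodes"

policy = train(collect(70))
if evaluate(policy):
    print(deploy(policy))   # the loop closes: collect -> train -> evaluate -> deploy
```

The point of the sketch is the control flow: evaluation gates deployment, and a failed evaluation sends you back to collecting more (or more varied) data.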
Getting Started
Isaac for Healthcare SO-ARM Starter Workflow is available now. To get started:
Clone the repository: git clone https://github.com/isaac-for-healthcare/i4h-workflows.git
Choose a workflow : Start with the SO-ARM Starter Workflow for surgical assistance or explore other workflows
Run the setup: Each workflow includes an automated setup script (for example, tools/env_setup_so_arm_starter.sh)
Resources
GitHub Repository : Complete workflow implementations
Documentation : Setup and usage guides
GR00T Models : Pre-trained foundation models
Hardware Guides : SO-ARM101 setup instructions
LeRobot Repository : End-to-end robotics learning