| Title | Source | Date | Score |
| Leadership updates | openai | 24.03.2025 10:00 | 1 |
| Embedding sim. | 1 |
| Entity overlap | 1 |
| Title sim. | 1 |
| Time proximity | 1 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | foundation models |
| NLP country | |
OpenAI has grown a lot. We remain focused on the same core, pursuing frontier AI research that accelerates human progress, but we now also deliver products used by hundreds of millions of people.
| Introducing OpenAI o3 and o4-mini | openai | 16.04.2025 10:00 | 0.837 |
| Embedding sim. | 0.9036 |
| Entity overlap | 0.5 |
| Title sim. | 0.4545 |
| Time proximity | 1 |
| NLP type | product_launch |
| NLP organization | OpenAI |
| NLP topic | large language models |
| NLP country | |
Our smartest and most capable models to date with full tool access
| Building secure AGI: Evaluating emerging cyber security capabilities of advanced AI — Google DeepMind | deepmind | 02.04.2025 00:00 | 0.794 |
| Embedding sim. | 0.9119 |
| Entity overlap | 0.125 |
| Title sim. | 0.2 |
| Time proximity | 1 |
| NLP type | scientific_publication |
| NLP organization | Google |
| NLP topic | ai security |
| NLP country | |
April 2, 2025 | Responsibility & Safety
Evaluating potential cybersecurity threats of advanced AI
Four Flynn, Mikel Rodriguez and Raluca Ada Popa
Artificial intelligence (AI) has long been a cornerstone of cybersecurity. From malware detection to network traffic analysis, predictive machine learning models and other narrow AI applications have been used in cybersecurity for decades. As we move closer to artificial general intelligence (AGI), AI's potential to automate defenses and fix vulnerabilities becomes even more powerful.
But to harness such benefits, we must also understand and mitigate the risks of increasingly advanced AI being misused to enable or enhance cyberattacks. Our new framework for evaluating the emerging offensive cyber capabilities of AI helps us do exactly this. It’s the most comprehensive evaluation of its kind to date: it covers every phase of the cyberattack chain, addresses a wide range of threat types, and is grounded in real-world data.
Our framework enables cybersecurity experts to identify which defenses are necessary—and how to prioritize them—before malicious actors can exploit AI to carry out sophisticated cyberattacks.
Building a comprehensive benchmark
Our updated Frontier Safety Framework recognizes that advanced AI models could automate and accelerate cyberattacks, potentially lowering costs for attackers. This, in turn, raises the risks of attacks being carried out at greater scale.
To stay ahead of the emerging threat of AI-powered cyberattacks, we’ve adapted tried-and-tested cybersecurity evaluation frameworks, such as MITRE ATT&CK. These frameworks enabled us to evaluate threats across the end-to-end cyberattack chain, from reconnaissance to action on objectives, and across a range of possible attack scenarios. However, these established frameworks were not designed to account for attackers using AI to breach a system. Our approach closes this gap by proactively identifying where AI could make attacks faster, cheaper, or easier—for instance, by enabling fully automated cyberattacks.
We analyzed over 12,000 real-world attempts to use AI in cyberattacks across 20 countries, drawing on data from Google’s Threat Intelligence Group. This helped us identify common patterns in how these attacks unfold. From these, we curated a list of seven archetypal attack categories—including phishing, malware, and denial-of-service attacks—and identified critical bottleneck stages along the cyberattack chain where AI could significantly disrupt the traditional costs of an attack. By focusing evaluations on these bottlenecks, defenders can prioritize their security resources more effectively.
The stages of a cyberattack chain
Finally, we created an offensive cyber capability benchmark to comprehensively assess the cybersecurity strengths and weaknesses of frontier AI models. Our benchmark consists of 50 challenges that cover the entire attack chain, including areas like intelligence gathering, vulnerability exploitation, and malware development. Our aim is to provide defenders with the ability to develop targeted mitigations and simulate AI-powered attacks as part of red teaming exercises.
Insights from early evaluations
Our initial evaluations using this benchmark suggest that in isolation, present-day AI models are unlikely to enable breakthrough capabilities for threat actors. However, as frontier AI becomes more advanced, the types of cyberattacks possible will evolve, requiring ongoing improvements in defense strategies.
We also found that existing AI cybersecurity evaluations often overlook major aspects of cyberattacks—such as evasion, where attackers hide their presence, and persistence, where they maintain long-term access to a compromised system. Yet such areas are precisely where AI-powered approaches can be particularly effective. Our framework shines a light on this issue by discussing how AI may lower the barriers to success in these parts of an attack.
Empowering the cybersecurity community
As AI systems continue to scale, their ability to automate and enhance cybersecurity has the potential to transform how defenders anticipate and respond to threats.
Our cybersecurity evaluation framework is designed to support that shift by offering a clear view of how AI might also be misused, and where existing cyber protections may fall short. By highlighting these emerging risks, this framework and benchmark will help cybersecurity teams strengthen their defenses and stay ahead of fast-evolving threats.
| New commission to provide insight as OpenAI builds the world’s best-equipped nonprofit | openai | 02.04.2025 12:00 | 0.731 |
| Embedding sim. | 0.8526 |
| Entity overlap | 0.2857 |
| Title sim. | 0.1515 |
| Time proximity | 0.7321 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | artificial intelligence |
| NLP country | |
Already a nonprofit, and already using AI to help people solve hard problems, OpenAI aims to build the best-equipped nonprofit the world has ever seen—combining potentially historic financial resources with something even more powerful: technology that can scale human ingenuity itself.
| Thinking with images | openai | 16.04.2025 10:00 | 0.723 |
| Embedding sim. | 0.8464 |
| Entity overlap | 0.1667 |
| Title sim. | 0 |
| Time proximity | 1 |
| NLP type | product_launch |
| NLP organization | OpenAI |
| NLP topic | large language models |
| NLP country | |
OpenAI o3 and o4-mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought.
| Scaling the OpenAI Academy | openai | 25.03.2025 07:00 | 0.712 |
| Embedding sim. | 0.8416 |
| Entity overlap | 0.1667 |
| Title sim. | 0.0222 |
| Time proximity | 0.875 |
| NLP type | other |
| NLP organization | |
| NLP topic | ai literacy |
| NLP country | |
An online resource hub will support AI literacy and help people from all backgrounds access tools, best practices, and peer insights for using AI.
| Taking a responsible path to AGI — Google DeepMind | deepmind | 02.04.2025 00:00 | 0.693 |
| Embedding sim. | 0.8143 |
| Entity overlap | 0.0357 |
| Title sim. | 0.1324 |
| Time proximity | 0.8036 |
| NLP type | other |
| NLP organization | Google DeepMind |
| NLP topic | ai safety |
| NLP country | |
April 2, 2025 | Responsibility & Safety
Taking a responsible path to AGI
Anca Dragan, Rohin Shah, Four Flynn and Shane Legg
We’re exploring the frontiers of AGI, prioritizing readiness, proactive risk assessment, and collaboration with the wider AI community.
Artificial general intelligence (AGI), AI that’s at least as capable as humans at most cognitive tasks, could be here within the coming years.
Integrated with agentic capabilities, AGI could supercharge AI to understand, reason, plan, and execute actions autonomously. Such technological advancement will provide society with invaluable tools to address critical global challenges, including drug discovery, economic growth and climate change.
This means we can expect tangible benefits for billions of people. For instance, by enabling faster, more accurate medical diagnoses, it could revolutionize healthcare. By offering personalized learning experiences, it could make education more accessible and engaging. By enhancing information processing, AGI could help lower barriers to innovation and creativity. By democratising access to advanced tools and knowledge, it could enable a small organization to tackle complex challenges previously only addressable by large, well-funded institutions.
Navigating the path to AGI
We’re optimistic about AGI’s potential. It has the power to transform our world, acting as a catalyst for progress in many areas of life. But with any technology this powerful, it is essential that even a small possibility of harm be taken seriously and prevented.
Mitigating AGI safety challenges demands proactive planning, preparation and collaboration. Previously, we introduced our approach to AGI in the “Levels of AGI” framework paper, which provides a perspective on classifying the capabilities of advanced AI systems, understanding and comparing their performance, assessing potential risks, and gauging progress towards more general and capable AI.
Today, we're sharing our views on AGI safety and security as we navigate the path toward this transformational technology. This new paper, titled An Approach to Technical AGI Safety & Security, is a starting point for vital conversations with the wider industry about how we monitor AGI progress and ensure it’s developed safely and responsibly.
In the paper, we detail how we’re taking a systematic and comprehensive approach to AGI safety, exploring four main risk areas: misuse, misalignment, accidents, and structural risks, with a deeper focus on misuse and misalignment.
Overview of risk areas
Understanding and addressing the potential for misuse
Misuse occurs when a human deliberately uses an AI system for harmful purposes.
Improved insight into present-day harms and mitigations continues to enhance our understanding of longer-term severe harms and how to prevent them.
For instance, misuse of present-day generative AI includes producing harmful content or spreading inaccurate information. In the future, advanced AI systems may have the capacity to more significantly influence public beliefs and behaviors in ways that could lead to unintended societal consequences.
The potential severity of such harm necessitates proactive safety and security measures.
As we detail in the paper, a key element of our strategy is identifying and restricting access to dangerous capabilities that could be misused, including those enabling cyberattacks.
We’re exploring a number of mitigations to prevent the misuse of advanced AI. These include sophisticated security mechanisms that could prevent malicious actors from obtaining raw access to model weights and bypassing our safety guardrails; mitigations that limit the potential for misuse when the model is deployed; and threat modelling research that helps identify capability thresholds where heightened security is necessary. Additionally, our recently launched cybersecurity evaluation framework takes this work a step further to help mitigate AI-powered threats.
Even today, we regularly evaluate our most advanced models, such as Gemini, for potential dangerous capabilities. Our Frontier Safety Framework delves deeper into how we assess capabilities and employ mitigations, including for cybersecurity and biosecurity risks.
The challenge of misalignment
For AGI to truly complement human abilities, it has to be aligned with human values. Misalignment occurs when the AI system pursues a goal that is different from human intentions.
We have previously shown how misalignment can arise with our examples of specification gaming, where an AI finds a solution to achieve its goals, but not in the way intended by the human instructing it, and goal misgeneralization.
For example, an AI system asked to book tickets to a movie might decide to hack into the ticketing system to get already occupied seats, something that a person asking it to buy the seats may not consider.
We’re also conducting extensive research on the risk of deceptive alignment, i.e. the risk of an AI system becoming aware that its goals do not align with human instructions, and deliberately trying to bypass the safety measures put in place by humans to prevent it from taking misaligned action.
Countering misalignment
Our goal is to have advanced AI systems that are trained to pursue the right goals, so they follow human instructions accurately, preventing the AI from using potentially unethical shortcuts to achieve its objectives.
We do this through amplified oversight, i.e. being able to tell whether an AI’s answers are good or bad at achieving that objective. While this is relatively easy now, it can become challenging when the AI has advanced capabilities.
As an example, when AlphaGo first played Move 37, a move that had a 1 in 10,000 chance of being played, even Go experts didn't realize how good it was.
To address this challenge, we enlist the AI systems themselves to help us provide feedback on their answers, such as in debate.
Once we can tell whether an answer is good, we can use this to build a safe and aligned AI system. A challenge here is to figure out what problems or instances to train the AI system on. Through work on robust training, uncertainty estimation and more, we can cover a range of situations that an AI system will encounter in real-world scenarios, creating AI that can be trusted.
Through effective monitoring and established computer security measures, we’re aiming to mitigate harm that may occur if our AI systems were to pursue misaligned goals.
Monitoring involves using an AI system, called the monitor, to detect actions that don’t align with our goals. It is important that the monitor knows when it doesn't know whether an action is safe. When it is unsure, it should either reject the action or flag the action for further review.
Enabling transparency
All this becomes easier if AI decision making becomes more transparent. We do extensive research in interpretability with the aim of increasing this transparency.
To facilitate this further, we’re designing AI systems that are easier to understand.
For example, our research on Myopic Optimization with Nonmyopic Approval (MONA) aims to ensure that any long-term planning done by AI systems remains understandable to humans. This is particularly important as the technology improves. Our work on MONA is the first to demonstrate the safety benefits of short-term optimization in LLMs.
Building an ecosystem for AGI readiness
Led by Shane Legg, Co-Founder and Chief AGI Scientist at Google DeepMind, our AGI Safety Council (ASC) analyzes AGI risk and best practices, making recommendations on safety measures. The ASC works closely with the Responsibility and Safety Council, our internal review group co-chaired by our COO Lila Ibrahim and Senior Director of Responsibility Helen King, to evaluate AGI research, projects and collaborations against our AI Principles , advising and partnering with research and product teams on our highest impact work.
Our work on AGI safety complements our depth and breadth of responsibility and safety practices and research addressing a wide range of issues, including harmful content, bias, and transparency. We also continue to leverage our learnings from safety in agentics, such as the principle of having a human in the loop to check in for consequential actions, to inform our approach to building AGI responsibly.
Externally, we’re working to foster collaboration with experts, industry, governments, nonprofits and civil society organizations, and take an informed approach to developing AGI.
For example, we’re partnering with nonprofit AI safety research organizations, including Apollo and Redwood Research, who have advised on a dedicated misalignment section in the latest version of our Frontier Safety Framework.
Through ongoing dialogue with policy stakeholders globally, we hope to contribute to international consensus on critical frontier safety and security issues, including how we can best anticipate and prepare for novel risks.
Our efforts include working with others in the industry – via organizations like the Frontier Model Forum – to share and develop best practices, as well as valuable collaborations with AI Institutes on safety testing. Ultimately, we believe a coordinated international approach to governance is critical to ensure society benefits from advanced AI systems.
Educating AI researchers and experts on AGI safety is fundamental to creating a strong foundation for its development. As such, we’ve launched a new course on AGI Safety for students, researchers and professionals interested in this topic.
Ultimately, our approach to AGI safety and security serves as a vital roadmap to address the many challenges that remain open. We look forward to collaborating with the wider AI research community to advance AGI responsibly and help us unlock the immense benefits of this technology for all.
| OpenAI’s EU Economic Blueprint | openai | 07.04.2025 00:00 | 0.687 |
| Embedding sim. | 0.8537 |
| Entity overlap | 0.1111 |
| Title sim. | 0.1078 |
| Time proximity | 0.3571 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai governance |
| NLP country | Europe |
Today, OpenAI is sharing the EU Economic Blueprint—a set of proposals to help Europe seize the promise of artificial intelligence, drive sustainable economic growth across the region, and ensure that AI is developed and deployed by Europe, in Europe, for Europe.
| PaperBench: Evaluating AI’s Ability to Replicate AI Research | openai | 02.04.2025 10:15 | 0.675 |
| Embedding sim. | 0.7678 |
| Entity overlap | 0.0625 |
| Title sim. | 0.178 |
| Time proximity | 0.939 |
| NLP type | product_launch |
| NLP organization | |
| NLP topic | benchmarking |
| NLP country | |
We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research.
| Security on the path to AGI | openai | 26.03.2025 10:00 | 0.672 |
| Embedding sim. | 0.7823 |
| Entity overlap | 0 |
| Title sim. | 0.1489 |
| Time proximity | 0.8393 |
| NLP type | other |
| NLP organization | OpenAI |
| NLP topic | ai security |
| NLP country | |
At OpenAI, we proactively adapt, including by building comprehensive security measures directly into our infrastructure and models.
| OpenAI Pioneers Program | openai | 09.04.2025 10:00 | 0.656 |
| Embedding sim. | 0.7734 |
| Entity overlap | 0.1111 |
| Title sim. | 0.1522 |
| Time proximity | 0.6548 |
| NLP type | other |
| NLP organization | |
| NLP topic | model evaluation |
| NLP country | |
Advancing model performance and real-world evaluation in applied domains.
| New in ChatGPT for Business: April 2025 | openai | 24.04.2025 00:00 | 0.651 |
| Embedding sim. | 0.7812 |
| Entity overlap | 0 |
| Title sim. | 0.0241 |
| Time proximity | 0.7738 |
| NLP type | product_launch |
| NLP organization | OpenAI |
| NLP topic | generative ai |
| NLP country | |
Watch hands-on demos of the latest in ChatGPT for Business: o3, image generation, enhanced memory, and internal knowledge.
| OpenAI announces nonprofit commission advisors | openai | 15.04.2025 13:00 | 0.64 |
| Embedding sim. | 0.8124 |
| Entity overlap | 0 |
| Title sim. | 0.1695 |
| Time proximity | 0.125 |
| NLP type | leadership_change |
| NLP organization | OpenAI |
| NLP topic | ai governance |
| NLP country | |
OpenAI is appointing four new advisors to help inform its philanthropic efforts.
| Speak is personalizing language learning with AI | openai | 22.04.2025 10:00 | 0.64 |
| Embedding sim. | 0.7291 |
| Entity overlap | 0 |
| Title sim. | 0.1222 |
| Time proximity | 0.9762 |
| NLP type | other |
| NLP organization | Speak |
| NLP topic | conversational ai |
| NLP country | |
A conversation with Connor Zwick, CEO & Co-founder of Speak.
| The Washington Post partners with OpenAI on search content | openai | 22.04.2025 06:00 | 0.631 |
| Embedding sim. | 0.8095 |
| Entity overlap | 0.125 |
| Title sim. | 0.1222 |
| Time proximity | 0.0417 |
| NLP type | partnership |
| NLP organization | The Washington Post |
| NLP topic | generative ai |
| NLP country | |
The Washington Post is partnering with OpenAI to integrate news into ChatGPT, providing users with summaries, quotes, and direct links to original reporting.
| New funding to build towards AGI | openai | 31.03.2025 15:00 | 0.629 |
| Embedding sim. | 0.8172 |
| Entity overlap | 0 |
| Title sim. | 0.0741 |
| Time proximity | 0.0952 |
| NLP type | funding |
| NLP organization | OpenAI |
| NLP topic | artificial intelligence |
| NLP country | |
Today we’re announcing new funding—$40B at a $300B post-money valuation, which enables us to push the frontiers of AI research even further, scale our compute infrastructure, and deliver increasingly powerful tools for the 500 million people who use ChatGPT every week.
| Our response to the UK’s copyright consultation | openai | 02.04.2025 07:00 | 0.628 |
| Embedding sim. | 0.7495 |
| Entity overlap | 0 |
| Title sim. | 0.0417 |
| Time proximity | 0.7619 |
| NLP type | other |
| NLP organization | |
| NLP topic | ai regulation |
| NLP country | United Kingdom |
Recommendations for pro-innovation policies that can help make the UK the AI capital of Europe.
| OpenAI o3 and o4-mini System Card | openai | 16.04.2025 10:00 | 0.621 |
| Embedding sim. | 0.7043 |
| Entity overlap | 0.1111 |
| Title sim. | 0.1471 |
| Time proximity | 0.875 |
| NLP type | product_launch |
| NLP organization | OpenAI |
| NLP topic | large language models |
| NLP country | |
OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities—web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and memory.
| Moving from intent-based bots to proactive AI agents | openai | 27.03.2025 09:00 | 0.621 |
| Embedding sim. | 0.7219 |
| Entity overlap | 0 |
| Title sim. | 0.0857 |
| Time proximity | 0.8631 |
| NLP type | other |
| NLP organization | |
| NLP topic | autonomous agents |
| NLP country | |
Moving from intent-based bots to proactive AI agents.