Cluster #3776 - News Clusters

Improving mathematical reasoning with process supervision

closed

Event type	scientific_publication
Topic	mathematical reasoning
Organization
Country

Articles	1
Unique sources	1
Importance / Momentum	0.69 / 0
Period	31.05.2023 07:00 to 31.05.2023 07:00
Created	06.04.2026 05:59:40

Articles in cluster: 1

Title

Source

Publication date

Score

Improving mathematical reasoning with process supervision

openai

31.05.2023 07:00

Embedding sim.	1
Entity overlap	1
Title sim.	1
Time proximity	1

NLP type	scientific_publication
NLP organization
NLP topic	mathematical reasoning
NLP country


We've trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”). In addition to boosting performance relative to outcome supervision, process supervision also has an important alignment benefit: it directly trains the model to produce a chain-of-thought that is endorsed by humans.
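
To make the distinction between the two reward schemes concrete, here is a minimal illustrative sketch. It is not the training code behind the article: the `Solution` class, the per-step boolean labels, and the 0/1 reward values are all hypothetical simplifications chosen for illustration. It only shows where the feedback signal attaches: to the final answer alone (outcome supervision) or to each reasoning step (process supervision).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Solution:
    # Hypothetical labels: one boolean per reasoning step (True = step judged correct).
    step_labels: List[bool]
    # Whether the final answer matches the reference answer.
    final_answer_correct: bool


def outcome_reward(solution: Solution) -> float:
    """Outcome supervision: a single reward based only on the final answer."""
    return 1.0 if solution.final_answer_correct else 0.0


def process_rewards(solution: Solution) -> List[float]:
    """Process supervision: one reward per reasoning step."""
    return [1.0 if ok else 0.0 for ok in solution.step_labels]


# A solution that reaches the right answer despite a flawed second step.
lucky = Solution(step_labels=[True, False, True], final_answer_correct=True)

print(outcome_reward(lucky))   # 1.0
print(process_rewards(lucky))  # [1.0, 0.0, 1.0]
```

In this toy case the outcome signal rewards the whole solution even though one step is wrong, while the per-step signal flags the flawed step. That is the sense in which step-level feedback can favor a chain-of-thought that humans would endorse.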