Improving mathematical reasoning with process supervision
closed
Event type: scientific_publication
Topic: mathematical reasoning
Organization:
Country:
Articles: 1
Unique sources: 1
Importance / Momentum: 0.69 / 0
Period: 31.05.2023 07:00 to 31.05.2023 07:00
Created: 06.04.2026 05:59:40
Articles in cluster: 1
Title | Source | Publication date | Score
S Improving mathematical reasoning with process supervision | openai | 31.05.2023 07:00 | 1
Embedding sim.: 1
Entity overlap: 1
Title sim.: 1
Time proximity: 1
NLP type: scientific_publication
NLP organization:
NLP topic: mathematical reasoning
NLP country:

We've trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”). In addition to boosting performance relative to outcome supervision, process supervision also has an important alignment benefit: it directly trains the model to produce a chain-of-thought that is endorsed by humans.
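
The difference between the two reward schemes can be illustrated with a toy sketch. This is not the training setup from the publication: the `Solution` class, the per-step labels, and both reward functions are hypothetical names introduced here, and the binary rewards are an assumption made for simplicity. It only shows where the feedback signal lands in each scheme.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Solution:
    """A hypothetical container for one model-generated solution."""
    steps: List[str]          # chain-of-thought steps, in order
    step_correct: List[bool]  # assumed human label for each step
    final_correct: bool       # does the final answer match the reference?


def outcome_rewards(sol: Solution) -> List[float]:
    """Outcome supervision: a single signal based on the final answer only."""
    if not sol.steps:
        return []
    final = 1.0 if sol.final_correct else 0.0
    # Intermediate steps receive no direct feedback.
    return [0.0] * (len(sol.steps) - 1) + [final]


def process_rewards(sol: Solution) -> List[float]:
    """Process supervision: one signal per step, based on its own label."""
    return [1.0 if ok else 0.0 for ok in sol.step_correct]


# A solution that reaches the right answer through flawed reasoning.
example = Solution(
    steps=["set up equation", "faulty simplification", "lucky final answer"],
    step_correct=[True, False, False],
    final_correct=True,
)

print(outcome_rewards(example))  # [0.0, 0.0, 1.0]
print(process_rewards(example))  # [1.0, 0.0, 0.0]
```

In this toy case outcome supervision rewards the solution because the final answer is right, while process supervision rewards only the one step a human labeler would endorse.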