260318 Telegram Digest
[8/10] InterveneBench: Benchmarking LLMs for Intervention Reasoning and Causal Study De
InterveneBench: Benchmarking LLMs for Intervention Reasoning and Causal Study Design in Real Social Systems
Causal inference in social science relies on end-to-end, intervention-centered research-design reasoning grounded in real-world policy interventions, but current benchmarks fail to evaluate this capability of large language models (LLMs). We present InterveneBench, a benchmark designed to assess such reasoning in realistic social settings. Each instance in InterveneBench is derived from an empirical social science study and requires models to reason about policy interventions and identification
Tags: agent_orchestration llm multi-agent agent agent framework
Score: 8/10 (agent framework, multi-agent)
[6/10] The PokeAgent Challenge: Competitive and Long-Context Learning at Scale We prese
The PokeAgent Challenge: Competitive and Long-Context Learning at Scale
We present the PokeAgent Challenge, a large-scale benchmark for decision-making research built on Pokemon's multi-agent battle system and expansive role-playing game (RPG) environment. Partial observability, game-theoretic reasoning, and long-horizon planning remain open problems for frontier AI, yet few benchmarks stress all three simultaneously under realistic conditions. PokeAgent targets these limitations at scale through two complementary tracks: our Battling Track, which calls for strategi
Tags: agent_orchestration ron llm agent orchestration multi-agent orches
Score: 6/10 (multi-agent)
[6/10] Agentic workflow enables the recovery of critical materials from complex feedsto
Agentic workflow enables the recovery of critical materials from complex feedstocks via selective precipitation
We present a multi-agentic workflow for critical materials recovery that deploys a series of AI agents and automated instruments to recover critical materials from produced water and magnet leachates. This approach achieves selective precipitation from real-world feedstocks using simple chemicals, accelerating the development of efficient, adaptable, and scalable separations to a timeline of days, rather than months and years.
Tags: agent_orchestration multi-agent agent
Score: 6/10 (multi-agent)
[7/10] South Korea moves to secure alternative naphtha import sources. Original: SOUTH KOREA MOVES TO SECURE ALTERNATIVE NAPHTHA IM
South Korea moves to secure alternative naphtha import sources.
Original: SOUTH KOREA MOVES TO SECURE ALTERNATIVE NAPHTHA IMPORT SOURCES. ...
Source: @marketfeed. Translation provided by: Google
Source: https://x.com/FirstSquawk/status/2034046356114010367
Score: 7/10 (etf)
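[6/10] Roundup of beneficiaries in a market that cannot fill the gap left by Qatar: LNG exporters (Cheniere and others), LNG shipping companies (surging freight rates), LNG infrastructure (construction, EPC),
Roundup of beneficiaries in a market that cannot fill the gap left by Qatar: LNG exporters (Cheniere and others), LNG shipping companies (surging freight rates), LNG infrastructure (construction, EPC), alternative energy sources (coal, nuclear, renewables), and Korea-related winners and losers (shipbuilding, Korea Gas Corporation, and others). Includes an analysis of the optimal portfolio and a ceasefire scenario.
Score: 6/10 (portfolio, analysis)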