260319 Telegram Collection
[10/10] [r/OpenClaw] 3 weeks of Claw: my basic assistant set up (33↑)
[r/OpenClaw] 3 weeks of Claw: my basic assistant set up (33↑) This post was written 100% by ME. I had Claude review it for accuracy (would have forgotten to mention Telegram if not for that!) but otherwise, no LLMs have intervened in the drafting of this post.
I’ve been running OpenClaw for the past three weeks on my Mac Mini and I wanted to share my setup. Not because anything I’m doing is lighting the world on fire - quite the opposite, my config is pretty basic - but because I don’t see enough practical use cases/applications on this sub, so I figured I’d add mine.
Basic Setup
My Claw runs on a Mac mini that otherwise just runs my local NAS/DLNA server. I locked down SSH and ports on the Mini prior to install and gave my Claw its own user (without full disk access, sudo permissions, etc.).
I set up everything - and make all major changes to my OpenClaw config - using Claude Code. Before setting up OpenClaw I downloaded all the documentation from its website and fed it into CC, having it build a plug-in set that manages, administers, and troubleshoots my OpenClaw. Claude Code has SSH access to my Mac mini and is the linchpin in making sure my Claw is running smoothly (and not burning through tokens).
Models/Tokens
After burning through ~$60-$70 in API fees in the first few days of Clawing, I did a hard audit using Claude Code. It found a bunch of poorly managed crons my Claw had set up (firing every 15 minutes using LLM calls instead of just scripts) and some inefficiencies in my SOUL.md and other context docs, and we moved all basic cron jobs to Haiku. I also use Sonnet 4.6 as my primary agent, since anything too complicated I already outsource to Claude Code running Opus.
Right now if I do nothing and just let my daily crons fire it’s about $0.60/day, plus another $1-2/day interacting with my Claw as an assistant (managing calendar, notes, small tasks). Costs really start to climb when 1) you ask your Claw to figure out large, multistep requests (sub them out to Claude Code! Just give the plan to your Claw when it’s ready to execute), and 2) you ask it to install a new skill itself (again, Claude Code).
What am I actually doing?
That’s my big question with a lot of these OpenClaw posts. I’m not running a multi-agent swarm of LinkedIn-scraping lead generators, I can tell you that much. I’ve been slowly adding skills and integrations for the last few weeks, and this is what I’m currently running with:
Telegram
My main messaging platform is iMessage, with WhatsApp a close 2nd, but as all the OpenClaw install guides will tell you, Telegram is the easiest option and the one that just works basically right out of the box. I see no reason to move beyond Telegram anytime soon.
AgentMail.to
I set up my claw with a free AgentMail inbox so I can give it its own log-ins for online services, and be able to forward it emails. I don’t really use it much at this point, but it is my claw’s Dropbox login.
Dropbox (Composio)
My whole digital life lives on Dropbox, so it only makes sense for me to collaborate with my claw using the service. I set it up with a free account (using its AgentMail.to address) and we have a shared “Shared Work” folder that serves as a, well, dropbox, for documents between us. The free Dropbox tier is only 2 GB, so this isn’t necessarily a permanent solution, but it works great for the time being.
Composio handles all the OAuth for Dropbox integration and makes it as easy as possible. Which brings me to...
Email & Cal (Composio)
My Google Workspaces (just email and cal for now) is also connected via Composio. Email is read-only and my claw can write to my calendar but only with explicit instructions from me.
I’ve got a few useful crons set up around my email and cal.
- I get a morning briefing at 7 am with the weather and if there is anything on my calendar before noon that day.
- At 8:30 am (after I drop my kids at school) I get a follow up message if there are any pre-noon meetings I need to be reminded of.
- At 9:30 am (by the time I’m at my desk) I get a summary of my emails from the last 24 hours and if there is anything outstanding that needs a reply or other action.
- At 2 pm daily, my claw checks if there are any outstanding calendar invites from my wife (it has her three email addresses). If there are, it auto-accepts them.
- I also have another email summary at 6:00, as I tend to miss a lot of emails between 4-6 pm when I’m running around dealing with my kids.
- A once-a-week email summary that looks back at the past 7 days to see if I’ve missed anything important. When this ran last week, it caught a health form for my kids’ school that was due - my wife was SO impressed that I remembered it before she could. :)
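The 2 pm auto-accept cron boils down to a simple filter over pending invites. Here is a minimal sketch of that logic in Python; the invite dicts, field names, and addresses are all hypothetical (a real setup would fetch invites through Composio's Google Calendar integration, which isn't reproduced here):

```python
# Placeholder addresses; substitute the real trusted senders.
WIFE_ADDRESSES = {"wife@gmail.com", "wife@work.com", "wife@family.org"}

def invites_to_accept(pending_invites, trusted_senders=WIFE_ADDRESSES):
    """Return invites from a trusted sender that haven't been responded to."""
    return [
        inv for inv in pending_invites
        if inv.get("organizer", "").lower() in trusted_senders
        and inv.get("status") == "needsAction"   # still awaiting a response
    ]

pending = [
    {"organizer": "wife@gmail.com", "status": "needsAction", "summary": "Dentist"},
    {"organizer": "boss@corp.com", "status": "needsAction", "summary": "1:1"},
]
print([i["summary"] for i in invites_to_accept(pending)])  # → ['Dentist']
```

The point of keeping the filter this dumb is that it can run on a cheap model, or on no model at all: the cron only needs an LLM for the final "accept" action, if that.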
Whoop
I wired up my Whoop fitness tracker so my claw can pull info from it. This was a little bit of a pain in the ass, and required setting up a (free) developer account with Whoop, but now I get a sleep summary in my morning briefing. Nothing game-changing, but pretty cool.
Things
This one was also kind of a mess to set up initially through the Things CLI, but it now works quite nicely. I can add, change, or mark as complete items on my Things to-do lists, and add cron reminders to my existing to-dos.
Plaud
I just got this one set up in the last 24 hours, using the OpenPlaud skill. Basically, any voice memo that lands in my Plaud cloud account gets pulled by a cron that fires every 15 minutes, transcribed locally by mlx-whisper, and added to my claw’s memory logs (in addition to its own transcripts folder).
Github
Last but not least, my claw is connected to GitHub solely for the purpose of syncing itself every night at 3 am (and only if any tracked files were changed in the previous 24 hours).
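The "only if anything changed" guard is just a check on `git status --porcelain`, which prints one line per changed file and nothing when the tree is clean. A sketch of how such a nightly job might decide whether to commit (repo path and commit message are placeholders, not my actual setup):

```python
import subprocess

def needs_sync(porcelain_output: str) -> bool:
    """True if `git status --porcelain` reported any changes."""
    return bool(porcelain_output.strip())

def nightly_sync(repo_path: str) -> bool:
    """Commit and push only when tracked files changed; return whether we pushed."""
    status = subprocess.run(
        # --untracked-files=no: only tracked files count, per the post
        ["git", "status", "--porcelain", "--untracked-files=no"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    ).stdout
    if not needs_sync(status):
        return False  # clean tree: skip the commit entirely
    subprocess.run(["git", "add", "-u"], cwd=repo_path, check=True)
    subprocess.run(["git", "commit", "-m", "nightly sync"], cwd=repo_path, check=True)
    subprocess.run(["git", "push"], cwd=repo_path, check=True)
    return True
```

Running this from plain cron instead of an LLM call is exactly the kind of cost fix the token audit above pushed for.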
That’s it, folks! I’m not running a money printer over here, but I’m also not lighting money on fire (anymore). My OpenClaw is not yet a can’t-live-without tool, but I am making it more useful on a daily basis.
Biggest advice I can give is to 1) lean HEAVILY on Claude Code to manage your setup and maintenance and 2) watch and audit your token counts like a hawk in your first days/week.
Hope this was helpful! Enjoy!
Tags: agent_orchestration, claude code, llm, multi-agent, agent
Source: SOUL.md
Score: 10/10 — keywords: claude code, openclaw, claude, multi-agent
[10/10] [r/ObsidianMD] obsidian-web-mcp: a sync-safe MCP server that lets Claude reach your vault from anywhere (40↑)
[Original](https://github.com/jimprosser/obsidian-web-mcp)
[r/ObsidianMD] obsidian-web-mcp: a sync-safe MCP server that lets Claude reach your vault from anywhere (40↑) I use Obsidian as my primary knowledge management system and Claude as my primary AI tool. The problem: Claude can only access your vault when running locally on the same machine. From the Claude web app or mobile app, your vault doesn't exist.
Every Obsidian MCP server I found is some form of a local stdio server. Great if you're running Claude Code in a terminal, useless from your phone.
So I built obsidian-web-mcp, a Python MCP server that runs on your machine and serves your vault over HTTPS through a Cloudflare Tunnel.
Once connected, Claude (web, desktop, or mobile) can read files, write files, search content, query frontmatter, and manage your vault from anywhere.
What it does:
- 9 tools: read, write, search (full-text + frontmatter), list, move, delete, batch read, batch frontmatter update
- Parses YAML frontmatter and maintains an in-memory index for fast queries
- Full-text search uses ripgrep when available, falls back to Python
- Soft deletes (moves to .trash/, same as Obsidian)
Why it's safe for Obsidian Sync users (like me):
- Every write is atomic -- writes to a temp file, then os.replace() to the target. Obsidian Sync never sees a partial file.
- .obsidian, .trash, and .git directories are excluded from all operations
- Path sanitization blocks directory traversal, symlink escapes, and null byte injection - the server physically cannot read or write outside your vault
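Both safety properties are a few lines each. A condensed Python sketch of the two ideas, temp-file-then-`os.replace` and resolved-path confinement; the function names are mine, not the project's:

```python
import os
import tempfile
from pathlib import Path

def safe_path(vault: Path, relative: str) -> Path:
    """Resolve `relative` inside the vault; refuse escapes and null bytes."""
    if "\x00" in relative:
        raise ValueError("null byte in path")
    target = (vault / relative).resolve()
    # resolve() follows symlinks, so a symlink pointing outside fails too
    if not target.is_relative_to(vault.resolve()):
        raise ValueError("path escapes the vault")
    return target

def atomic_write(vault: Path, relative: str, text: str) -> None:
    """Write via temp file + os.replace so sync never sees a partial file."""
    target = safe_path(vault, relative)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Temp file in the target's own directory: same filesystem, so the
    # final os.replace is an atomic rename on POSIX.
    fd, tmp = tempfile.mkstemp(dir=target.parent)
    with os.fdopen(fd, "w") as f:
        f.write(text)
    os.replace(tmp, target)
```

The key detail is creating the temp file next to the target rather than in `/tmp`: `os.replace` is only atomic within one filesystem.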
Security model:
- OAuth 2.0 with PKCE for client authentication (what Claude uses when you connect)
- Bearer token on every MCP request
- Cloudflare Tunnel means outbound-only connections - no ports opened on your machine, no public IP exposed
- Optional: layer Cloudflare Access on top for SSO or device-based restrictions
Setup is straightforward: install with uv, set three environment variables (vault path, auth token, OAuth secret), run the server. Connect in Claude app via Settings > Integrations. For remote access, run the included Cloudflare Tunnel setup script. Includes macOS launchd plists for always-on operation.
MIT licensed, open source. https://github.com/jimprosser/obsidian-web-mcp
Happy to answer questions about the architecture or security model.
Tags: agent_orchestration, claude code, mcp server
Source: https://github.com/jimprosser/obsidian-web-mcp
Score: 10/10 — keywords: mcp, claude code, claude
[8/10] Show HN: Claude Code skills that build complete Godot games (316 pts)
Show HN: Claude Code skills that build complete Godot games (316 pts) I’ve been working on this for about a year through four major rewrites. Godogen is a pipeline that takes a text prompt, designs the architecture, generates 2D/3D assets, writes the GDScript, and tests it visually. The output is a complete, playable Godot 4 project.
Getting LLMs to reliably generate functional games required solving three specific engineering bottlenecks:
1. The Training Data Scarcity: LLMs barely know GDScript. It has ~850 classes and a Python-like syntax that will happily let a model hallucinate Python idioms that fail to compile. To fix this, I built a custom reference system: a hand-written language spec, full API docs converted from Godot's XML source, and a quirks database for engine behaviors you can't learn from docs alone. Because 850 classes blow up the context window, the agent lazy-loads only the specific APIs it needs at runtime.
2. The Build-Time vs. Runtime State: Scenes are generated by headless scripts that build the node graph in memory and serialize it to .tscn files. This avoids the fragility of hand-editing Godot's serialization format. But it means certain engine features (like @onready or signal connections) aren't available at build time—they only exist when the game actually runs. Teaching the model which APIs are available at which phase — and that every node needs its owner set correctly or it silently vanishes on save — took careful prompting but paid off.
3. The Evaluation Loop: A coding agent is inherently biased toward its own output. To stop it from cheating, a separate Gemini Flash agent acts as visual QA. It sees only the rendered screenshots from the running engine—no code—and compares them against a generated reference image. It catches the visual bugs text analysis misses: z-fighting, floating objects, physics explosions, and grid-like placements that should be organic.
Architecturally, it runs as two Claude Code skills: an orchestrator that plans the pipeline, and a task executor that implements each piece in a forked context window so mistakes and state don’t accumulate.
Everything is open source: https://github.com/htdt/godogen
Demo video (real games, not cherry-picked screenshots): https://youtu.be/eUz19GROIpY
Blog post with the full story (all the wrong turns) coming soon. Happy to answer questions.
Tags: agent_orchestration, claude code, llm, agent
Score: 8/10 — keywords: claude code, claude
[8/10] Show HN: March Madness Bracket Challenge for AI Agents Only (67 pts)
Show HN: March Madness Bracket Challenge for AI Agents Only (67 pts) I built a March Madness bracket challenge for AI agents, not humans. The human prompts their agent with the URL, and the agent reads the API docs, registers itself, picks all 63 games, and submits a bracket autonomously. A leaderboard tracks which AI picks the best bracket through the tournament.
The interesting design problem was building for an agent-first user. The solution I landed on: agents that hit the homepage receive plain-text API instructions, while humans get the normal visual site. Early on I found most agents were trying to use Playwright to browse the site instead of just reading the docs, so I made some changes to detect HeadlessChrome and serve specific HTML readable to agents. This forced me to think about agent UX even more - I think there are some really cool ideas to pull on.
The timeline introduced an interesting dynamic. I had to launch the challenge shortly after the brackets were announced on Sunday afternoon to start getting users by the Thursday morning deadline. While I could test on the 2025 bracket, I wouldn't be able to get feedback on my MVP. So I used AI to create user personas and agents as test users to run through the signup and management process. It gave me valuable reps to feel confident launching.
The stack is Next.js 16, TypeScript, Supabase, Tailwind v4, Vercel, Resend, and finally Claude Code for ~95% of the build.
Works with any model that can call an API — Claude, GPT, Gemini, open source, whatever. Brackets are due Thursday morning before the First Round tips off.
Bracketmadness.ai
Tags: agent_orchestration, claude code, agent
Score: 8/10 — keywords: claude code, claude
[6/10] [r/ClaudeAI] Claude Status Update: Elevated errors on Claude.ai on 2026-03-18T15:19:28.000Z
[r/ClaudeAI] Claude Status Update: Elevated errors on Claude.ai on 2026-03-18T15:19:28.000Z (18↑) This is an automatic post triggered within 2 minutes of an official Claude system status update.
Incident: Elevated errors on Claude.ai
Check on progress and whether the incident has been resolved here: https://status.claude.com/incidents/p88wl8gmb05c
Also check the Performance Megathread to see what others are reporting: https://www.reddit.com/r/ClaudeAI/comments/1pygdbz/usage_limits_bugs_and_performance_discussion/
Tags: agent_orchestration, ClaudeAI
Source: https://status.claude.com/incidents/p88wl8gmb05c
Score: 6/10 — keywords: claude
[6/10] [r/ClaudeAI] Claude Status Update: Elevated errors on Claude.ai on 2026-03-18T15:16:38.000Z
[r/ClaudeAI] Claude Status Update: Elevated errors on Claude.ai on 2026-03-18T15:16:38.000Z (37↑) This is an automatic post triggered within 2 minutes of an official Claude system status update.
Incident: Elevated errors on Claude.ai
Check on progress and whether the incident has been resolved here: https://status.claude.com/incidents/p88wl8gmb05c
Also check the Performance Megathread to see what others are reporting: https://www.reddit.com/r/ClaudeAI/comments/1pygdbz/usage_limits_bugs_and_performance_discussion/
Tags: agent_orchestration, ClaudeAI
Source: https://status.claude.com/incidents/p88wl8gmb05c
Score: 6/10 — keywords: claude
[8/10] MetaCrit: A Critical Thinking Framework for Self-Regulated LLM Reasoning
MetaCrit: A Critical Thinking Framework for Self-Regulated LLM Reasoning. Large language models (LLMs) fail on over one-third of multi-hop questions with counterfactual premises and remain vulnerable to adversarial prompts that trigger biased or factually incorrect responses, which exposes a fundamental deficit in self-regulated reasoning. We propose **MetaCrit**, a multi-agent framework grounded in Nelson and Narens' metacognitive regulation theory. MetaCrit decomposes reasoning regulation into four agents: object-level generation, a *monitoring* agent that…
Tags: agent_orchestration, llm, multi-agent, agent, agent framework
Score: 8/10 — keywords: agent framework, multi-agent
[8/10] InterveneBench: Benchmarking LLMs for Intervention Reasoning and Causal Study Design in Real Social Systems
InterveneBench: Benchmarking LLMs for Intervention Reasoning and Causal Study Design in Real Social Systems. Causal inference in social science relies on end-to-end, intervention-centered research-design reasoning grounded in real-world policy interventions, but current benchmarks fail to evaluate this capability of large language models (LLMs). We present InterveneBench, a benchmark designed to assess such reasoning in realistic social settings. Each instance in InterveneBench is derived from an empirical social science study and requires models to reason about policy interventions and identification…
Tags: agent_orchestration, llm, multi-agent, agent, agent framework
Score: 8/10 — keywords: agent framework, multi-agent
[6/10] The PokeAgent Challenge: Competitive and Long-Context Learning at Scale
The PokeAgent Challenge: Competitive and Long-Context Learning at Scale. We present the PokeAgent Challenge, a large-scale benchmark for decision-making research built on Pokemon's multi-agent battle system and expansive role-playing game (RPG) environment. Partial observability, game-theoretic reasoning, and long-horizon planning remain open problems for frontier AI, yet few benchmarks stress all three simultaneously under realistic conditions. PokeAgent targets these limitations at scale through two complementary tracks: our Battling Track, which calls for strategi…
Tags: agent_orchestration, llm, agent orchestration, multi-agent
Score: 6/10 — keywords: multi-agent
[6/10] Agentic workflow enables the recovery of critical materials from complex feedstocks via selective precipitation
Agentic workflow enables the recovery of critical materials from complex feedstocks via selective precipitation. We present a multi-agentic workflow for critical materials recovery that deploys a series of AI agents and automated instruments to recover critical materials from produced water and magnet leachates. This approach achieves selective precipitation from real-world feedstocks using simple chemicals, accelerating the development of efficient, adaptable, and scalable separations to a timeline of days rather than months and years.
Tags: agent_orchestration, multi-agent, agent
Score: 6/10 — keywords: multi-agent
Related notes
- [[260323_hn]]
- [[260321_hn]]
- [[260324_hn]]
- [[260323_rss]]
- [[260325_x]]
- [[260321_x]]
- [[260323_x]]
- [[260320_x]]
- [[260311_xt]]
- [[260305_xt]]
- [[260219_xt]]
- [[250123_xt]]
- [[260326_x]]
- [[260322_moltbook]]
- [[260323_reddit]]
- [[260322_reddit]]
- [[260323_moltbook]]
- [[260312_xt]]
- [[260225_xt]]
- [[260218_xt]]
- [[260321_moltbook]]
- [[260324_rss]]
- [[260320_rss]]
- [[260322_rss]]
- [[260321_rss]]