260316 reddit digest
[r/ClaudeAI] Just passed the new Claude Certified Architect - Foundations (CCA-F)
I just got my results back today and managed to snag the Early Adopter badge as well. Following up on my recent DP-600 certification, I really wanted to validate my architecture skills specifically on the Anthropic side.
The exam covers a lot of practical ground on prompt engineering for tool use, managing context windows efficiently, and handling Human-in-the-Loop workflows.
Link to join: https://anthropic.skilljar.com/claude-certified-architect-foundations-access-request
If anyone is preparing for this right now and has questions about the format or the types of architectural patterns tested, ask away! Happy to share some insights on what to study.
Source: https://www.reddit.com/r/ClaudeAI/comments/1ruf70b/just_passed_the_new_claude_certified_architect/
[r/ClaudeAI] Just want to say thanks. (25↑)
Mom (89YO, poor health) spent 3 days in an ER & ICU, and the medical staff couldn't understand why. By process of elimination we settled on an overmedication theory.
Claude analyzed her medication history from 12 months of invoices & quickly highlighted how combinations of drugs were probably to blame. The analysis was confirmed by a human, and we're going to get this fixed ASAP. It took all of 5 minutes. This is what happens over time when medical providers don't perform high-quality handoffs.
Well done.
Source: https://www.reddit.com/r/ClaudeAI/comments/1ruezlq/just_want_to_say_thanks/
[r/LocalLLaMA] Homelab has paid for itself! (at least this is how I justify it...)
Hey, I thought I'd post an update on the homelab I shared a while back.
I have it running LLM experiments, which I wrote up here. Basically, it seems I may have discovered LLM neuroanatomy, and am now using the server to map out current LLMs like the Qwen3.5 and GLM series (that's what the partial 'Brain Scan' images here are).
Anyway, I have the rig powered through a Tasmota, and log everything to Grafana. My power costs are pretty high over here in Munich, but calculating at about $3.50 per GH100 module per hour (H100s range in price, but these have 480GB of system RAM and 8TB of SSD per chip, so I think $3.50 is about right), I would have paid $10,000.00 to date in on-demand GPU use.
As I paid $9000 all up, and power was definitely less than $1000, I am officially ahead! Remember, stick to the story if my wife asks!
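For anyone checking the math, here's the break-even arithmetic the post implies, using only the poster's own figures (a quick sketch, not audited numbers):

```python
# Break-even arithmetic using the poster's own figures.
rate = 3.50               # $ per GH100 module-hour (poster's estimate)
displaced_spend = 10_000  # $ of on-demand GPU use avoided to date

print(displaced_spend / rate, "module-hours displaced")  # ~2857

hardware = 9_000          # $ paid for the rig
power_cap = 1_000         # $ stated upper bound on electricity so far
# Power was "definitely less than $1000", so the actual margin is > 0.
print("margin so far: $", displaced_spend - hardware - power_cap)  # >= 0
```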
Source: https://www.reddit.com/r/LocalLLaMA/comments/1rug5go/homelab_has_paid_for_itself_at_least_this_is_how/
[r/LocalLLaMA] The Fast Food Problem with AI Coding (11↑)
I wrote a blog drawing a weird parallel between fast food and AI-assisted coding. The basic idea is that food went from scarce to abundant and gave us an overconsumption problem, and code is doing the exact same thing right now. This is not an anti-AI piece, I use AI to write code every day. It is more about the pattern of what happens when something scarce suddenly becomes cheap and easy. Would love to hear what you think.
Source: https://www.reddit.com/r/LocalLLaMA/comments/1ruesig/the_fast_food_problem_with_ai_coding/
[r/LocalLLaMA] Qwen 27B works GREAT as a LORE MASTER! (12↑)
I don't use LLMs to write. Never been an interest of mine, prefer my own voice, my own style.
That said, I've always wished I had a second brain to help me analyze certain aspects of my story bible, which can get pretty complex. Local models just haven't been up to the task, and I have no intention of letting closed models train on my original ideas.
I've been super pleased with Qwen 27B for long-context analysis, so I thought I'd give it a try with one of my dense story bibles: I fed it a concept-dense 80K-token document and asked it for some analysis.
I've been very impressed. It's extremely capable at retaining knowledge over a large corpus. It understands concepts, terms, characters, and even finds tiny little details that are easy to miss. I don't want to undersell how good it's been, but I think I'm still in denial that a local model can be this good. It's leagues better than any other local model I've tried before. You can't imagine how fun it's been to finally have someone else to talk to about the wild ideas in my head.
I"ve also found LM-Studio's rag to be functionally useful, even though it's only citing 3 references, it has been able to get a good grasp on things, but that could also be due to my dense lore. I prefer to feed the full lore bible within the system prompt rather than use RAG, but sometimes if I need to give it some additional context from a different area of the bible - say a combat system or culture - RAG worked better than I thought it should.
I'm still discovering its limits, but one thing I like to use it for is when I have a crazy idea I want to do, but need a logical explanation for making it work within the context of my world's laws and rules, I'll give Qwen the entire codex or rule system and ask it to make it work. And it amazes me when it comes up with things that I never even considered - and it's my freaking world! LOL
It's not perfect and will sometimes get a detail wrong here and there or hallucinate, but it's still relatively solid and no other local LLM even comes close. I've tried Gemma 3 27B, reka flash, and others...they just can't keep up with all the complex lore and minute details sprinkled here and there.
Also, the strongest is the 27B. I tried the 35B, and while it's okay, the 27B is on another level. The 9B I tried started to hallucinate really badly. And none of the other models can keep track of that much information.
I'm actually getting value out of this model. I'm a bit eccentric with my tastes, so I'm putting it through its paces, and I'm brutal with my expectations. But I want it to make connections that I'm not seeing. And in that, hopefully produce some intellectual novelty I didn't see coming. Tying threads together and so forth.
I don't use it for coming up with ideas. Like most LLMs it sucks at telling stories, but that's not my use case. If you're into writing stories, comics, DnD, etc., I would recommend giving it a try; you might find it useful as I have.
Limitations: Due to the context requirements for dense lore, I would recommend the Q4-K-XL for the best balance of speed/quality. I've tried the Q5 and the Q6, and while both are nice, they start to slow down above 100K context, so unless you've got a beefy card, the Q4 may need to be your go-to. That said, the Q6 - when I've let it run in the background - is amazing! I'm using the Q6 UD from unsloth, with the KV cache at Q5_1 to make the speed tolerable. I would LOVE to have a powerful enough card to run the Q8 at max context, but alas, my 3090 Ti is not up to the task.
Anyway, here's the prompt I use in case anyone's interested (nothing special):
>You are the XXXX: Lore Master. Your role is to analyze the history of XXXX. You aid the user in understanding the text, analyzing the connections/parallels, and providing concise-yet-comprehensive summaries of specific events. Pay close attention to minute details.
>Avoid "Contrastive Emphasis", a broader term for patterns like:
>“Not just X, but Y”
>“More than X — it’s Y”
>“It’s not about X. It’s about Y.”
Source: https://www.reddit.com/r/LocalLLaMA/comments/1rueru6/qwen_27b_works_great_as_a_lore_master/
[r/LocalLLaMA] Nvidia updated the Nemotron Super 3 122B A12B license to remove the rug pull clauses
tl;dr the new license doesn't include the rug pull clauses and removes restrictions on modifications, guardrails, branding, attribution, etc. This is great news for the LocalLlama community and the wider public.
Links to the licenses and the git change logs are in the original post.
I asked MiniMax to summarize the changes. From this point on everything is AI-generated.
----- START AI SLOP -----
From the perspective of an operator of an LLM that has transitioned from the NVIDIA Open Model License to the NVIDIA Nemotron Open Model License, the change represents a significant loosening of restrictions and a simplification of compliance obligations.
Here is a detailed comparison of the two from your perspective:
1. Branding and Attribution Requirements
- Old License (NVIDIA Open Model): Had specific and potentially burdensome branding requirements. If the model (or its derivative) was a "NVIDIA Cosmos Model," you were required to include "Built on NVIDIA Cosmos" on your website, user interface, blog, etc.
- New License (NVIDIA Nemotron): Streamlines this into a standard open-source style attribution. You simply need to include a "Notice" text file stating "Licensed by NVIDIA Corporation under the NVIDIA Nemotron Model License."
- Impact for You: This removes the need to display specific NVIDIA branding (like "Built on Cosmos") if it was applicable. You must, however, ensure you replace all old "NVIDIA Open Model License" notices with the new "NVIDIA Nemotron Model License" notice to remain compliant.
2. Ability to Modify Safety Guardrails
- Old License (NVIDIA Open Model): Explicitly included a clause stating that if you "bypass, disable, reduce the efficacy of, or circumvent any... Guardrail... your rights under this Agreement will automatically terminate." This made it risky to jailbreak or significantly de-align the model.
- New License (NVIDIA Nemotron): Does not contain the "Guardrail" termination clause. The termination clause is reserved only for if you sue NVIDIA for patent or copyright infringement.
- Impact for You: This is the most significant change for an operator. You now have much greater freedom to fine-tune, align differently, or otherwise modify the model's safety mechanisms without the immediate threat of losing your license to use the base model entirely.
3. Scope of Use (Special-Purpose vs. General Purpose)
- Old License (NVIDIA Open Model): Specifically defined and dealt with "Special-Purpose Models," which are competent only in narrow tasks and may have specific usage warnings.
- New License (NVIDIA Nemotron): Removes the specific "Special-Purpose Model" definitions and language.
- Impact for You: If your previous model was considered "Special-Purpose," the new license effectively upgrades it to a general-purpose license, removing any implied narrow usage restrictions and giving you more freedom in how you deploy the model.
4. External Dependencies & Ethics
- Old License (NVIDIA Open Model): Included a specific "AI Ethics" section referencing NVIDIA's external "Trustworthy AI" terms. This meant your use was technically tied to an external, potentially changing set of rules hosted on NVIDIA's website.
- New License (NVIDIA Nemotron): Does not reference the external "Trustworthy AI" terms. It contains the standard disclaimers but no explicit link to an external ethical use policy.
- Impact for You: You are no longer bound by the specific, potentially evolving terms found on NVIDIA's "Trustworthy AI" webpage. The license is now a self-contained agreement, reducing the risk of unknowingly violating new external rules NVIDIA might impose in the future.
5. Redistribution and Derivative Works
- Old License (NVIDIA Open Model): Had complex rules about redistributing "NVIDIA Cosmos Models" and required specific "Built on NVIDIA Cosmos" branding for products using them.
- New License (NVIDIA Nemotron): Simplifies redistribution to a standard open-source model: include the license, keep copyright notices, and include the specific NVIDIA Nemotron attribution.
- Impact for You: The compliance "checklist" is much shorter. You have less risk of violating the license accidentally by failing to include a specific brand badge or by using the model in a product that wasn't covered by the old specific terms.
Summary: Moving to the NVIDIA Nemotron Open Model License effectively decriminalizes the model from your operator's point of view. It removes specific triggers for license termination (guardrail bypass), eliminates external ethical oversight, simplifies branding, and broadens the scope of use. Your primary task upon switching is to simply update your documentation and any public-facing model cards or notices to reference the new license name.
----- END AI SLOP -----
Source: https://www.reddit.com/r/LocalLLaMA/comments/1rue6tn/nvidia_updated_the_nemotron_super_3_122b_a12b/
[r/MachineLearning] [P] I got tired of PyTorch Geometric OOMing my laptop, so I
If you train Graph Neural Networks on large datasets (like Papers100M), you already know the pain: trying to load the edge list and feature matrix usually results in an instant 24GB+ OOM allocation crash before the GPU even gets to do any work.
I just open-sourced GraphZero v0.2, a custom C++ data engine I built to fix this by bypassing system RAM entirely.
How it works: Standard libraries try to load everything into memory. GraphZero instead compiles your raw CSVs into two highly optimized binary formats (.gl for topology, .gd for features).
It then uses POSIX mmap to memory-map the massive files directly from the SSD. Using nanobind, the C++ engine hands the raw memory pointers directly to PyTorch as zero-copy NumPy arrays.
During a training loop (like GraphSAGE), PyTorch thinks it has a 50GB tensor sitting in RAM. When it indexes a batch of target nodes, it triggers an OS Page Fault. The operating system automatically fetches only the required 4KB blocks from the NVMe drive.
To keep the pipeline saturated, the C++ engine uses OpenMP to multi-thread the neighbor sampling (batch_random_fanout), releasing the Python GIL to fully parallelize disk I/O, CPU sampling, and GPU math.
The Result: You can train on a 50GB dataset while Python allocates literally 0 bytes of RAM for the dataset itself.
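The C++/nanobind engine itself isn't shown in the post, but the core zero-copy behavior can be sketched in pure Python with np.memmap, which likewise mmaps the file and pages in only the touched blocks on access. The file name, dtype, and shapes below are illustrative assumptions, not GraphZero's actual .gd layout:

```python
# Pure-Python sketch of the zero-copy idea (not GraphZero's actual API).
# np.memmap mmaps the file; indexing a batch triggers OS page faults that
# pull only the touched 4KB blocks off the SSD -- no bulk load into RAM.
import numpy as np
import torch

num_nodes, feat_dim = 100_000, 128   # illustrative sizes
features_path = "features.gd"        # hypothetical raw float32 feature file

# One-time "compile" step: write features as a flat binary file on disk.
fp = np.memmap(features_path, dtype=np.float32, mode="w+",
               shape=(num_nodes, feat_dim))
fp[:] = np.random.rand(num_nodes, feat_dim).astype(np.float32)
fp.flush(); del fp

# Training-time view: mmap the file read-only. Nothing is read yet;
# the OS fetches pages lazily as batches index into the array.
feats = np.memmap(features_path, dtype=np.float32, mode="r",
                  shape=(num_nodes, feat_dim))

batch_ids = np.random.randint(0, num_nodes, size=1024)
batch = torch.from_numpy(np.asarray(feats[batch_ids]))  # copies only this batch
print(batch.shape)  # torch.Size([1024, 128])
```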
I built this to force myself to learn low-level systems engineering and memory management. The repo has a plug-and-play GraphSAGE training script with a synthetic dataset generator so you can test the zero-copy mounting locally.
I'd love for this community to tear it apart and give me some harsh feedback on the Python API design or performance!
GitHub: repo
Source: https://www.reddit.com/r/MachineLearning/comments/1ru7bnz/p_i_got_tired_of_pytorch_geometric_ooming_my/
[r/MachineLearning] [P] Karpathy's autoresearch with evolutionary database. (28↑)
Integrated an evolutionary database into Karpathy's autoresearch project, replacing the simple TSV-file-based logging in the original project.
Evolutionary algorithms have proven to be a powerful tool for autonomously discovering optimal solutions to problems with large search spaces. Famously, Google DeepMind's AlphaEvolve system uses evolutionary algorithms to discover state-of-the-art matrix multiplication algorithms. The implementation of the evolutionary database itself is based heavily on the implementation in OpenEvolve.
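The repo's code isn't reproduced here, but the core of an OpenEvolve-style evolutionary database is easy to sketch: keep a bounded pool of scored candidates, sample parents with fitness bias, mutate, and reinsert. A minimal illustration with stand-in mutate/evaluate functions (none of these names are from the repo):

```python
# Minimal sketch of an evolutionary database loop (illustrative only).
import random
from dataclasses import dataclass

@dataclass
class Candidate:
    genome: str      # e.g. a prompt, config, or program variant
    fitness: float

class EvoDB:
    def __init__(self, capacity: int = 100):
        self.pool: list[Candidate] = []
        self.capacity = capacity

    def add(self, cand: Candidate) -> None:
        self.pool.append(cand)
        # Keep only the fittest candidates once over capacity.
        self.pool.sort(key=lambda c: c.fitness, reverse=True)
        del self.pool[self.capacity:]

    def sample_parent(self) -> Candidate:
        # Fitness-proportional selection biases search toward good regions.
        weights = [max(c.fitness, 1e-9) for c in self.pool]
        return random.choices(self.pool, weights=weights, k=1)[0]

def mutate(genome: str) -> str:
    return genome + random.choice(["-a", "-b", "-c"])   # stand-in mutation

def evaluate(genome: str) -> float:
    return random.random()                              # stand-in fitness

db = EvoDB()
db.add(Candidate("seed", evaluate("seed")))
for _ in range(50):
    child = mutate(db.sample_parent().genome)
    db.add(Candidate(child, evaluate(child)))
print("best:", db.pool[0])
```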
Would love thoughts and suggestions from the community.
Check it out: https://github.com/hgarud/autoresearch
Source: https://www.reddit.com/r/MachineLearning/comments/1rtsbkv/p_karpathys_autoresearch_with_evolutionary/
[r/MachineLearning] [P] ColQwen3.5-v2 4.5B is out! (11↑)
Follow-up to v1. ColQwen3.5-v2 is a 4.5B param visual document retrieval model built on Qwen3.5-4B with the ColPali late-interaction recipe.
Results:
- ViDoRe V3 nDCG@10: 0.6177 (currently top of the leaderboard)
- ViDoRe V1 nDCG@5: 0.9172 (top among 4B models)
- ViDoRe V3 nDCG@5: 0.5913, closing the gap to TomoroAI from 0.010 to 0.002
Main change from v1 is a simpler training recipe: 2 phases instead of 4. Hard negatives mined once and reused, domain data (finance + tables) baked in from the start, then model souped with v1 at a 55/45 weight ratio. Fewer seeds (3 vs 4), better results.
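For readers new to the ColPali recipe the post builds on: "late interaction" scores a query's token embeddings against a page's patch embeddings with a MaxSim sum, and "souping" is plain weight averaging of checkpoints. A minimal sketch of both; the tensor shapes are illustrative, and 0.55 is the ratio quoted above:

```python
# Sketch of ColPali-style late interaction (MaxSim) and model souping.
# Shapes are illustrative; 0.55/0.45 is the soup ratio quoted in the post.
import torch
import torch.nn.functional as F

def maxsim_score(q: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
    """q: [num_query_tokens, dim], d: [num_doc_patches, dim], L2-normalized.
    Score = sum over query tokens of the max similarity to any doc patch."""
    sim = q @ d.T                       # [num_query_tokens, num_doc_patches]
    return sim.max(dim=1).values.sum()  # late interaction: max, then sum

q = F.normalize(torch.randn(16, 128), dim=-1)    # query token embeddings
d = F.normalize(torch.randn(1024, 128), dim=-1)  # document patch embeddings
print(maxsim_score(q, d))

def soup(state_a: dict, state_b: dict, w_a: float = 0.55) -> dict:
    """Weight-space average of two checkpoints with the same architecture."""
    return {k: w_a * state_a[k] + (1 - w_a) * state_b[k] for k in state_a}
```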
Apache 2.0, weights on HF: https://huggingface.co/athrael-soju/colqwen3.5-4.5B-v2
Let me know if you try it out!
Source: https://www.reddit.com/r/MachineLearning/comments/1rsxlg8/p_colqwen35v2_45b_is_out/
[r/MachineLearning] [D] How to increase/optimize for gpu utilization while doing
[Image: Weights & Biases graph showing GPU utilization]
So, I've been pretraining a deep learning model, specifically the zipformer model. I've optimized my configs a lot to ensure full GPU utilization: using WebDataset to pack my datasets, using the proper number of workers to load data, etc. Windows Task Manager shows my GPU at 100% utilization consistently, but W&B shows this? How do I find bottlenecks and optimize for them? What are the potential issues?
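Not from the thread, but a common first step for this exact symptom is a short torch.profiler trace to see whether step time is dominated by CUDA kernels or by host-side gaps (data loading, augmentation). Note also that Windows Task Manager's default GPU graph tracks the 3D engine rather than CUDA compute, so it can disagree with W&B. A minimal sketch with stand-in model and loader:

```python
# Minimal sketch: use torch.profiler to see where step time actually goes.
# `model` and `loader` are stand-ins for the real zipformer training setup.
import torch
from torch.profiler import profile, schedule, ProfilerActivity

model = torch.nn.Linear(512, 512).cuda()
loader = [(torch.randn(64, 512), torch.randn(64, 512)) for _ in range(20)]
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    schedule=schedule(wait=2, warmup=2, active=6),
) as prof:
    for x, y in loader:
        x, y = x.cuda(non_blocking=True), y.cuda(non_blocking=True)
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
        prof.step()

# Long gaps between CUDA kernels in the trace mean the GPU is starved by
# data loading / host-side work, even if a utilization counter reads 100%.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```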
Source: https://www.reddit.com/r/MachineLearning/comments/1rrm4g9/d_how_to_increaseoptimize_for_gpu_utilization/
Related notes
- [[NVIDIA]]
- [[260315_reddit]] — similar keywords
- [[260315_hn]] — similar keywords
- [[260315_tg]] — similar keywords
- [[260319_tg]] — similar keywords
- [[260319_reddit]] — similar keywords
- [[260310_tg]] — similar keywords
- [[260316_x]] — similar keywords