Papers
Souper-Model How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
Black-Box On-Policy Distillation of Large Language Models
Depth Anything 3 Recovering the Visual Space from Any Views
LeJEPA Provable and Scalable Self-Supervised Learning Without the Heuristics
Scaling Latent Reasoning via Looped Language Models
Kosmos An AI Scientist for Autonomous Discovery
Emu3.5 Native Multimodal Models are World Learners
Context Engineering 2.0 - The Context of Context Engineering
Kimi Linear An Expressive, Efficient Attention Architecture
Exploring Conditions for Diffusion Models in Robotic Control
A Survey of Data Agents Emerging Paradigm or Overstated Hype
Real Deep Research for AI, Robotics and Beyond
The Free Transformer
A Definition of AGI
FineVision Open Data Is All You Need
DeepSeek-OCR Contexts Optical Compression
Detect Anything via Next Point Prediction
MCPMark A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use
The Dragon Hatchling The Missing Link between the Transformer and Models of the Brain
VFF-Net Evolving forward–forward algorithms into convolutional neural networks for enhanced computational insights
Diffusion Transformers with Representation Autoencoders
Training-Free Group Relative Policy Optimization
Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought
Meta-Awareness Enhances Reasoning Models Self-Alignment Reinforcement Learning
Agent Learning via Early Experience
FAST-DLLM V2 Efficient Block-Diffusion LLM
Less is More Recursive Reasoning with Tiny Networks
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
CoDA Agentic Systems for Collaborative Data Visualization
Video models are zero-shot learners and reasoners
Soft Tokens, Hard Truths
Sharing is Caring Efficient LM Post-Training with Collective RL Experience Sharing
Why Language Models Hallucinate
Hunyuan3D Studio End-to-End AI Pipeline for Game-Ready 3D Asset Generation
DINOv3
Prefix-Tuning Optimizing Continuous Prompts for Generation
ImageNet Classification with Deep Convolutional Neural Networks
You Only Look Once, Unified Real-Time Object Detection
Attention Is All You Need
EXAONE 4.0 Unified Large Language Models Integrating Non-reasoning and Reasoning Modes
A Crowd Pose Annotation Dataset for Accurate Multi-Person Pose Recognition in Crowded Scenes