Papers
Deep Delta Learning
Seedream 4.0 Toward Next-generation Multimodal Image Generation
Souper-Model How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
Black-Box On-Policy Distillation of Large Language Models
Depth Anything 3 Recovering the Visual Space from Any Views
LeJEPA Provable and Scalable Self-Supervised Learning Without the Heuristics
Kosmos An AI Scientist for Autonomous Discovery
Context Engineering 2.0 - The Context of Context Engineering
Kimi Linear An Expressive, Efficient Attention Architecture
Emu3.5 Native Multimodal Models are World Learners
Exploring Conditions for Diffusion Models in Robotic Control
Real Deep Research for AI, Robotics and Beyond
A Survey of Data Agents Emerging Paradigm or Overstated Hype
The Free Transformer
A Definition of AGI
FineVision Open Data Is All You Need
DeepSeek-OCR Contexts Optical Compression
Detect Anything via Next Point Prediction
MCPMark A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use
The Dragon Hatchling The Missing Link between the Transformer and Models of the Brain
VFF-Net Evolving forward–forward algorithms into convolutional neural networks for enhanced computational insights
Diffusion Transformers with Representation Autoencoders
Training-Free Group Relative Policy Optimization
Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought
Meta-Awareness Enhances Reasoning Models Self-Alignment Reinforcement Learning
Agent Learning via Early Experience
FAST-DLLM V2 Efficient Block-Diffusion LLM
Less is More Recursive Reasoning with Tiny Networks
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
CoDA Agentic Systems for Collaborative Data Visualization
Video models are zero-shot learners and reasoners
Soft Tokens, Hard Truths
Sharing is Caring Efficient LM Post-Training with Collective RL Experience Sharing
Why Language Models Hallucinate
Hunyuan3D Studio End-to-End AI Pipeline for Game-Ready 3D Asset Generation
DINOv3
Prefix-Tuning Optimizing Continuous Prompts for Generation
ImageNet Classification with Deep Convolutional Neural Networks
You Only Look Once, Unified Real-Time Object Detection
Attention Is All You Need
EXAONE 4.0 Unified Large Language Models Integrating Non-reasoning and Reasoning Modes
A Crowd Pose Annotation Dataset for Accurate Multi-Person Pose Recognition in Crowded Scenes