FrontierPapers — Daily AI Research from Top Labs
5289 papers

Today

Vision · Hugging Face
Just now

A new method for continuous, parametric decomposition of time-series data

Functional Continuous Decomposition

Researchers developed Functional Continuous Decomposition (FCD), a JAX-accelerated framework that uses Levenberg-Marquardt optimization to perform C^1 continuous, parametric fitting of time-series data into multiple temporal modes, improving analysis and feature extraction for machine learning models.
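The summary does not spell out FCD's mode parameterization or its JAX internals, but the general fitting loop it relies on can be sketched. Below is a minimal NumPy illustration (not the authors' code) of Levenberg-Marquardt fitting a two-mode Gaussian decomposition of a time series; the Gaussian mode shape and all parameter values are assumptions for illustration:

```python
import numpy as np

def modes(params, t):
    # Two Gaussian temporal modes, each with (amplitude, center, width).
    a1, c1, w1, a2, c2, w2 = params
    return (a1 * np.exp(-((t - c1) / w1) ** 2)
            + a2 * np.exp(-((t - c2) / w2) ** 2))

def levenberg_marquardt(t, y, params, iters=50, lam=1e-2):
    # Minimal LM loop with a finite-difference Jacobian.
    params = np.asarray(params, dtype=float)
    for _ in range(iters):
        r = y - modes(params, t)
        J = np.empty((t.size, params.size))
        for j in range(params.size):
            p = params.copy()
            p[j] += 1e-6
            J[:, j] = (modes(p, t) - modes(params, t)) / 1e-6
        # Damped normal equations: (J^T J + lam I) dp = J^T r.
        dp = np.linalg.solve(J.T @ J + lam * np.eye(params.size), J.T @ r)
        if np.sum((y - modes(params + dp, t)) ** 2) < np.sum(r ** 2):
            params, lam = params + dp, lam * 0.5  # accept step, relax damping
        else:
            lam *= 2.0                            # reject step, damp harder
    return params

t = np.linspace(0.0, 10.0, 200)
true_params = np.array([1.0, 3.0, 0.8, 0.5, 7.0, 1.2])
y = modes(true_params, t)
fit = levenberg_marquardt(t, y, [0.8, 2.5, 1.0, 0.6, 7.5, 1.0])
```

Because each mode is a smooth parametric function, the fitted decomposition is C^1 continuous by construction; the JAX version would replace the finite-difference Jacobian with autodiff.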

Source
LLMs · Hugging Face
1h ago

Generating vector fonts from text or images using a multimodal language model

VecGlypher: Unified Vector Glyph Generation with Language Models

VecGlypher is a multimodal language model that directly generates high-fidelity vector glyphs (font characters) from text descriptions or image examples by autoregressively emitting SVG path tokens, trained on a large dataset of fonts in a two-stage process.
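The paper's actual token vocabulary is not described in the summary; as a hypothetical sketch of what "emitting SVG path tokens" can look like, here is a toy tokenizer that turns a path string into discrete command and quantized-coordinate tokens of the kind an autoregressive model could emit one at a time (grid size and coordinate extent are assumed):

```python
import re

def tokenize_path(d, grid=256, extent=1000.0):
    # Map an SVG path string to a discrete token sequence:
    # commands become <M>/<L>/... tokens, coordinates are quantized
    # onto a fixed grid so each becomes one of `grid` coordinate tokens.
    tokens = []
    for part in re.findall(r"[MLCQZz]|-?\d+\.?\d*", d):
        if part.isalpha():
            tokens.append(f"<{part.upper()}>")
        else:
            q = max(0, min(grid - 1, round(float(part) / extent * (grid - 1))))
            tokens.append(f"<COORD_{q}>")
    return tokens

def detokenize(tokens, grid=256, extent=1000.0):
    # Inverse mapping back to (quantized) path text.
    out = []
    for tok in tokens:
        if tok.startswith("<COORD_"):
            q = int(tok[7:-1])
            out.append(f"{q / (grid - 1) * extent:.1f}")
        else:
            out.append(tok[1:-1])
    return " ".join(out)

toks = tokenize_path("M 100 200 L 300 400 Z")
```

A language model trained on such sequences can then generate a glyph by sampling tokens left to right, exactly as it would text.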

Source
Vision · Hugging Face
3h ago

SeaCache: Speeding up diffusion models by caching based on image frequency changes

SeaCache: Spectral-Evolution-Aware Cache for Accelerating Diffusion Models

This paper introduces SeaCache, a training-free caching strategy that accelerates diffusion-model inference by reusing intermediate outputs. Its spectral-evolution-aware filter disentangles content from noise, tracking low-frequency structure in early denoising steps and high-frequency detail in later ones.
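SeaCache's actual filter is not specified in the summary; the following toy sketch, with an assumed radial band split and reuse threshold, illustrates the general idea of gating cache reuse on spectral change, watching low frequencies early in denoising and high frequencies later:

```python
import numpy as np

def band_energy(img, low_frac, high_frac):
    # Energy of the 2-D spectrum inside a radial frequency band.
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    r = np.hypot(yy, xx) / (min(h, w) / 2)
    mask = (r >= low_frac) & (r < high_frac)
    return float(np.sum(np.abs(F[mask]) ** 2))

def should_reuse(prev, cur, step, total_steps, tol=0.05):
    # Early steps: monitor low frequencies (global structure).
    # Late steps: monitor high frequencies (fine detail).
    lo, hi = (0.0, 0.25) if step < total_steps // 2 else (0.25, 1.0)
    e_prev = band_energy(prev, lo, hi)
    e_cur = band_energy(cur, lo, hi)
    return abs(e_cur - e_prev) / (e_prev + 1e-12) < tol

rng = np.random.default_rng(0)
prev = rng.normal(size=(32, 32))
reuse_same = should_reuse(prev, prev, step=2, total_steps=10)
reuse_noisy = should_reuse(prev, prev + rng.normal(size=(32, 32)),
                           step=2, total_steps=10)
```

When the monitored band barely changes between steps, the cached intermediate output is reused instead of recomputing the network pass.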

Source
Vision · Hugging Face
3h ago

Generating images efficiently by mapping them onto a sphere

Image Generation with a Sphere Encoder

Researchers developed the Sphere Encoder, a generative model that maps images to a spherical latent space and can generate high-quality images in a single pass, rivaling multi-step diffusion models with significantly fewer computational steps.
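The Sphere Encoder's architecture is not detailed in the summary; the sketch below only illustrates the latent-space side of such a design: projecting latents onto the unit hypersphere, sampling uniformly on it for single-pass generation, and interpolating along geodesics (all names and sizes here are illustrative):

```python
import numpy as np

def to_sphere(z):
    # Project latent vectors onto the unit hypersphere.
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def sample_sphere(n, dim, rng):
    # Normalized Gaussian draws are uniform on the sphere, so one
    # decoder pass on one draw is one generated image.
    return to_sphere(rng.normal(size=(n, dim)))

def slerp(a, b, t):
    # Geodesic (great-circle) interpolation between two on-sphere latents.
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    if np.sin(omega) < 1e-8:
        return a
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

rng = np.random.default_rng(0)
latents = sample_sphere(4, 16, rng)
midpoint = slerp(latents[0], latents[1], 0.5)
```

A spherical latent space makes sampling trivial (no iterative denoising) and keeps interpolations on the data manifold, which is one plausible reason a single decoder pass can suffice.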

Source
LLMs · Hugging Face
4h ago

Benchmarking LLM knowledge sources using models with open training data

NanoKnow: How to Know What Your Language Model Knows

Researchers created a benchmark, NanoKnow, using LLMs with fully open pre-training data to analyze how pre-training data frequency and external evidence influence model accuracy and knowledge utilization.

Source
Agents · Hugging Face
5h ago

Improving Native GUI Agents with Action-Aware Training and Robust RL

GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL

This paper introduces GUI-Libra, a training recipe that uses action-aware supervised fine-tuning and a KL-regularized reinforcement learning approach to improve native GUI agents' reasoning and task completion by addressing data scarcity and partial verifiability issues.

Source
LLMs · arXiv
5h ago

Benchmarking LLMs for Safe Planning in Partially Observable, Constrained Environments

SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints

This paper introduces SPOC, a benchmark that evaluates large language models' ability to plan safely in realistic environments with partial information and physical limitations, revealing that current models struggle with implicit safety constraints.

Source
Safety · arXiv
5h ago

Dynamically tracking and correcting analog memory device bias during AI model training

Dynamic Symmetric Point Tracking: Tackling Non-ideal Reference in Analog In-memory Training

This paper introduces a dynamic method to estimate and correct for the 'symmetric point' bias in analog in-memory computing devices during model training, improving accuracy without costly pre-calibration.

Source
Vision · arXiv
5h ago

Improving face verification for identical twins using multi-scale and asymmetry analysis

AHAN: Asymmetric Hierarchical Attention Network for Identical Twin Face Verification

This paper introduces AHAN, a neural network that uses hierarchical attention on facial regions and cross-attention between face halves to better distinguish identical twins, achieving 92.3% accuracy.
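The summary mentions cross-attention between face halves; as a generic illustration (not AHAN's implementation, and with learned projection matrices omitted for brevity), a single cross-attention step between two halves can be sketched as:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values, d):
    # Features from one face half query the other half, so the model can
    # compare corresponding regions and pick up subtle asymmetries.
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ keys_values, weights

d = 8
rng = np.random.default_rng(0)
left_half = rng.normal(size=(5, d))   # 5 region features per half (toy sizes)
right_half = rng.normal(size=(5, d))
attended, attn = cross_attention(left_half, right_half, d)
```

For identical twins, such left-right comparisons matter because facial asymmetries are among the few cues that differ between siblings.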

Source
Vision · arXiv
5h ago

Modeling network communities with two levels of convex hulls for explainable predictions

Archetypal Graph Generative Models: Explainable and Identifiable Communities via Anchor-Dominant Convex Hulls

This paper introduces GraphHull, a generative model that uses global archetypes and local prototypes, represented as convex hulls, to explain community structures and improve performance in tasks like link prediction and community detection.

Source
Vision · arXiv
5h ago

Editing 3D models quickly and consistently from a single view

Easy3E: Feed-Forward 3D Asset Editing via Rectified Voxel Flow

This paper introduces a feed-forward framework that uses a voxel flow-based deformation and a normal-guided appearance prior to enable fast, consistent, and high-fidelity 3D model editing from a single input view.

Source
LLMs · arXiv
5h ago

Generating private synthetic data efficiently using LLMs and dataset vectors

EPSVec: Efficient and Private Synthetic Data Generation via Dataset Vectors

This paper introduces EPSVec, a method that uses 'dataset vectors' to steer large language models for efficient and private synthetic data generation, decoupling privacy costs from generation and improving fidelity in low-data scenarios.

Source
Vision · arXiv
5h ago

AngelSlim: A toolkit for compressing large AI models for faster, cheaper use

AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

AngelSlim is a toolkit that combines various techniques like quantization, speculative decoding, and token pruning to make large AI models smaller and faster for industrial deployment, including the first viable 2-bit large model.
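AngelSlim's 2-bit pipeline is certainly more involved than this, but the basic building block it composes with other passes, symmetric uniform quantization, can be sketched in a few lines (the bit width and tensor here are illustrative only):

```python
import numpy as np

def quantize(w, bits=2):
    # Symmetric uniform quantization; for 2 bits the signed levels are
    # {-2, -1, 0, 1}, and the symmetric scale leaves -2 unused.
    qmax = 2 ** (bits - 1) - 1
    scale = float(np.max(np.abs(w))) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    # Reconstruct approximate float weights from integer codes.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64,)).astype(np.float32)
q, scale = quantize(w, bits=2)
w_hat = dequantize(q, scale)
```

At 2 bits the per-weight error bound is scale/2, which is why making such models "viable" typically requires the additional calibration and decoding tricks a toolkit bundles on top.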

Source
Vision · arXiv
5h ago

Medical AI models change answers to rephrased questions, often ignoring images

PSF-Med: Measuring and Explaining Paraphrase Sensitivity in Medical Vision Language Models

Researchers developed a benchmark to measure how medical vision-language models flip their 'yes/no' answers to rephrased questions, finding high flip rates and that some models maintain consistency even without an image, indicating reliance on language priors rather than visual grounding.

Source
Safety · arXiv
5h ago

Generating Realistic Surgical Videos, Including Rare Events, for AI Training

Towards Controllable Video Synthesis of Routine and Rare OR Events

Researchers developed a video diffusion framework that transforms operating room scenes into abstract geometric representations to synthesize controllable, realistic surgical videos, including rare safety-critical events, outperforming standard video generation methods and enabling training of AI models for near-miss detection.

Source
Vision · arXiv
5h ago

Recognizing walking modes directly on a sensor for better exoskeleton control

Event-Driven On-Sensor Locomotion Mode Recognition Using a Shank-Mounted IMU with Embedded Machine Learning for Exoskeleton Control

This paper demonstrates an activity recognition system that classifies locomotion modes (stance, walking, stairs) directly within a shank-mounted IMU, reducing power consumption and latency for exoskeleton control by transmitting only activity labels to the microcontroller.

Source
Agents · arXiv
5h ago

Making Agentic Reinforcement Learning Stable and Reproducible for Complex Tasks

ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning

This paper introduces ARLArena, a framework to analyze and stabilize agentic reinforcement learning by dissecting policy gradient methods into four dimensions, leading to SAMPO, a new optimization method that achieves consistent stability and strong performance.

Source
LLMs · arXiv
5h ago

Improving LLM safety by teaching models to reason about harmful requests

Alignment-Weighted DPO: A principled reasoning approach to improve safety alignment

This paper introduces Alignment-Weighted DPO, a method that uses a Chain-of-Thought fine-tuning dataset with step-by-step rationales to teach LLMs to produce principled safety refusals by assigning different preference weights to reasoning and final-answer segments during optimization.
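The paper's exact weighting scheme is not given in the summary; a minimal sketch of the idea, with assumed per-segment weights and per-token log-probability ratios (policy minus reference), could look like:

```python
import math

def weighted_dpo_loss(deltas_chosen, deltas_rejected,
                      w_chosen, w_rejected, beta=0.1):
    # deltas_* are per-token log-prob ratios (policy minus reference);
    # w_* are per-token weights derived from segment membership, e.g. a
    # smaller weight on reasoning tokens and a larger one on the
    # final-answer tokens (the weights below are illustrative).
    s_c = sum(w * d for w, d in zip(w_chosen, deltas_chosen))
    s_r = sum(w * d for w, d in zip(w_rejected, deltas_rejected))
    margin = beta * (s_c - s_r)
    return math.log1p(math.exp(-margin))  # -log sigmoid(margin)

# Reasoning tokens weighted 0.5, the final-answer token weighted 1.0.
loss = weighted_dpo_loss([0.2, 0.2, 1.0], [0.1, 0.1, -0.5],
                         [0.5, 0.5, 1.0], [0.5, 0.5, 1.0])
baseline = weighted_dpo_loss([0.0], [0.0], [1.0], [1.0])  # zero margin
```

Down-weighting the rationale segment lets the model learn step-by-step refusal reasoning without the chain-of-thought tokens dominating the preference signal for the final answer.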

Source

FrontierPapers — Daily AI research from the world's top labs