| 2024-09-25 | Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models | Matt Deitke et al. | 2409.17146v1 | null |
| 2024-09-25 | Attention Prompting on Image for Large Vision-Language Models | Runpeng Yu et al. | 2409.17143v1 | link |
| 2024-09-25 | Adaptive Cost Model for Query Optimization | Nikita Vasilenko et al. | 2409.17136v1 | null |
| 2024-09-25 | Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale | Fan Zhou et al. | 2409.17115v1 | link |
| 2024-09-25 | MorphoSeg: An Uncertainty-Aware Deep Learning Method for Biomedical Segmentation of Complex Cellular Morphologies | Tianhao Zhang et al. | 2409.17110v1 | link |
| 2024-09-25 | BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices | Yongqi Xu et al. | 2409.17093v1 | link |
| 2024-09-25 | SEN12-WATER: A New Dataset for Hydrological Applications and its Benchmarking | Luigi Russo et al. | 2409.17087v1 | null |
| 2024-09-25 | Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning? | Bowen Zhao et al. | 2409.17080v1 | null |
| 2024-09-25 | Efficient Feature Interactions with Transformers: Improving User Spending Propensity Predictions in Gaming | Ved Prakash et al. | 2409.17077v1 | null |
| 2024-09-25 | Benchmarking Domain Generalization Algorithms in Computational Pathology | Neda Zamanitajeddin et al. | 2409.17063v1 | null |
| 2024-09-25 | Omnibenchmark (alpha) for continuous and open benchmarking in bioinformatics | Izaskun Mallona et al. | 2409.17038v1 | null |
| 2024-09-25 | Enhanced Wavelet Scattering Network for image inpainting detection | Barglazan Adrian-Alin et al. | 2409.17023v1 | null |
| 2024-09-25 | PTQ4RIS: Post-Training Quantization for Referring Image Segmentation | Xiaoyan Jiang et al. | 2409.17020v1 | link |
| 2024-09-25 | Single Image, Any Face: Generalisable 3D Face Generation | Wenqing Wang et al. | 2409.16990v1 | null |
| 2024-09-25 | ABCFair: an Adaptable Benchmark approach for Comparing Fairness Methods | MaryBeth Defrance et al. | 2409.16965v1 | null |
| 2024-09-25 | RESAA: A Removal and Structural Analysis Attack Against Compound Logic Locking | Felipe Almeida et al. | 2409.16959v1 | null |
| 2024-09-25 | Informed deep hierarchical classification: a non-standard analysis inspired approach | Lorenzo Fiaschi et al. | 2409.16956v1 | null |
| 2024-09-25 | DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling | Kyuheon Jung et al. | 2409.16949v1 | link |
| 2024-09-25 | NTIRE 2024 Challenge on Stereo Image Super-Resolution: Methods and Results | Longguang Wang et al. | 2409.16947v1 | null |
| 2024-09-25 | One-body correlations and momentum distributions of trapped 1D Bose gases at finite temperature | Attila Takács et al. | 2409.16929v1 | null |
| 2024-09-25 | Game4Loc: A UAV Geo-Localization Benchmark from Game Data | Yuxiang Ji et al. | 2409.16925v1 | link |
| 2024-09-25 | Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing | Wenhao Liu et al. | 2409.16913v1 | null |
| 2024-09-25 | Tailored 3D microphantoms: an essential tool for quantitative phase tomography analysis of organoids | Michal Ziemczonok et al. | 2409.16888v1 | null |
| 2024-09-25 | The cumulant Green's functions method for the single impurity Anderson model | T. M. Sobreira et al. | 2409.16881v1 | null |
| 2024-09-25 | Robust Scene Change Detection Using Visual Foundation Models and Cross-Attention Mechanisms | Chun-Jung Lin et al. | 2409.16850v1 | null |
| 2024-09-25 | Exposing Assumptions in AI Benchmarks through Cognitive Modelling | Jonathan H. Rystrøm et al. | 2409.16849v1 | null |
| 2024-09-25 | CodeInsight: A Curated Dataset of Practical Coding Solutions from Stack Overflow | Nathanaël Beau et al. | 2409.16819v1 | link |
| 2024-09-25 | Benchmarking Deep Learning Models for Object Detection on Edge Computing Devices | Daghash K. Alqahtani et al. | 2409.16808v1 | null |
| 2024-09-25 | Mitigating the Bias of Large Language Model Evaluation | Hongli Zhou et al. | 2409.16788v1 | null |
| 2024-09-25 | MaViLS, a Benchmark Dataset for Video-to-Slide Alignment, Assessing Baseline Accuracy with a Multimodal Alignment Algorithm Leveraging Speech, OCR, and Visual Features | Katharina Anderer et al. | 2409.16765v1 | link |