Multi-Modal Recommendation (MMRec) aims to help users discover items of potential interest based on multi-modal input and has been widely deployed on e-commerce platforms. Recent works focus mainly on modeling item-side information; however, they ignore the abundant semantic information on the user side, including demographics, behavioral patterns, and feedback. Such imbalanced attention between items and users leads to representations that inadequately express users' comprehensive interests. In this paper, we propose a novel User-insight Multi-modal recommendation framework, termed UiM. This framework improves user modeling in three aspects. First, we construct an enriched user profile to redistribute attention over users' historical interactions, which helps better uncover primary interests from a large-scale item pool. Second, to further disentangle compact representations from heterogeneous items, we apply an auto-regressive multi-interest extraction network to the attention-reweighted item representations after self-adaptively fusing their multi-modal features. Third, an intrinsic shortcoming of a conventional recommender system is that it cannot access user feedback for in-place result adjustment; as a remedy, we obtain pseudo feedback in advance from an intelligent agent and then adjust the recommendation candidates accordingly for more refined results. Experiments show that UiM outperforms state-of-the-art models on three public datasets. Code will be released upon acceptance.
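To make the auto-regressive multi-interest extraction step concrete, the following is a minimal PyTorch sketch, assuming K interest vectors are extracted one at a time by attending over the attention-reweighted, multi-modally fused item embeddings, with a GRU cell carrying the auto-regressive query state. The class name AutoRegressiveMultiInterest, the layer choices, and all dimensions are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AutoRegressiveMultiInterest(nn.Module):
    """Illustrative auto-regressive multi-interest extractor (assumed design).

    Given fused multi-modal item embeddings of a user's history, it
    extracts K interest vectors sequentially; each step attends over the
    history conditioned on the interests extracted so far, so later
    interests are encouraged to cover residual behavior.
    """

    def __init__(self, dim: int, num_interests: int):
        super().__init__()
        self.num_interests = num_interests
        self.query_rnn = nn.GRUCell(dim, dim)  # carries the auto-regressive query state
        self.init_query = nn.Parameter(torch.randn(dim))

    def forward(self, items: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # items: (B, L, D) fused multi-modal item embeddings
        # mask:  (B, L) True at valid (non-padded) history positions
        B, L, D = items.shape
        state = self.init_query.expand(B, D).contiguous()
        interests = []
        for _ in range(self.num_interests):
            # scaled dot-product attention of the current query over the history
            scores = torch.einsum("bd,bld->bl", state, items) / D ** 0.5
            scores = scores.masked_fill(~mask, float("-inf"))
            attn = F.softmax(scores, dim=-1)                    # (B, L)
            interest = torch.einsum("bl,bld->bd", attn, items)  # (B, D)
            interests.append(interest)
            # condition the next query on the interest just extracted
            state = self.query_rnn(interest, state)
        return torch.stack(interests, dim=1)  # (B, K, D)


if __name__ == "__main__":
    extractor = AutoRegressiveMultiInterest(dim=64, num_interests=4)
    items = torch.randn(2, 10, 64)             # 2 users, 10 history items each
    mask = torch.ones(2, 10, dtype=torch.bool)
    print(extractor(items, mask).shape)        # torch.Size([2, 4, 64])
```

Under these assumptions, the GRU state is what makes the extraction auto-regressive: each new interest query depends on the interests already pulled out, rather than all K interests being computed independently.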