Indoor localization is of great importance to a wide range of applications in the era of mobile computing. Current mainstream solutions rely on Received Signal Strength (RSS) of wireless signals as fingerprints to distinguish and infer locations. However, those methods suffer from fingerprint ambiguity that roots in multipath fading and temporal dynamics of wireless signals. Though pioneer efforts have resorted to motion-assisted or peer-assisted localization, they neither work in real time nor work without the help of peer users, which introduces extra costs and constraints, and thus degrades their practicality. To get over these limitations, we propose Argus, an image-assisted localization system for mobile devices. The basic idea of Argus is to extract geometric constraints from crowdsourced photos, and to reduce fingerprint ambiguity by mapping the constraints jointly against the fingerprint space. We devise techniques for photo selection, geometric constraint extraction, joint location estimation, and build a prototype that runs on commodity phones. Extensive experiments show that Argus triples the localization accuracy of classic RSS-based method, in time no longer than normal WiFi scanning, with negligible energy consumption.