In conventional systems, a human operator must have knowledge of the machines and of the special languages used to communicate with them. On the one hand, such knowledge is desirable for the operator; on the other hand, acquiring it is laborious and is a significant cause of human error. To reduce this burden, an intelligent man-machine interface should exist between the human operator and the machines to be operated. In ordinary human communication, not only linguistic information but also visual information is effective, each compensating for the other's deficiencies. From this viewpoint, this paper discusses the problem of translating verbal expressions into visual images. The positional relation between any two objects in a visual scene is a key to translating verbal information into visual information, as illustrated in Fig. 1. The present translation system extends its knowledge through experience. It consists of Japanese language processing, image processing, and Japanese-to-scene translation functions.