Sunday, December 12, 2010

Reading #30: Tahuti: A Geometrical Sketch Recognition System for UML Class Diagrams (Hammond)

Comments:
Wenzhe Li
Summary:
The paper presents Tahuti, a dual-view, multi-stroke sketch recognition environment for UML class diagrams. It uses a geometry-based recognition method that gives users more freedom in how they draw and edit.

The multi-layer framework includes processing, selection, recognition, and identification. The main idea is to find every possible collection of strokes, then recognize and identify each one. To reduce the burden of grouping, the framework rules out many impossible collections by capping the number of strokes in a collection and by placing restrictions on each stroke.
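To make the pruning idea concrete, here is a minimal Python sketch. It is not Tahuti's actual code: the function names, the `max_strokes` cap value, and the bounding-box proximity test (`gap`) are assumptions chosen for illustration, since the summary above only says that a maximum collection size and some per-stroke restrictions are used.

```python
from itertools import combinations


def bbox(stroke):
    """Axis-aligned bounding box of a stroke given as a list of (x, y) points."""
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def near(a, b, gap):
    """True if the bounding boxes of strokes a and b are within `gap` of each other."""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    return (ax0 - gap <= bx1 and bx0 - gap <= ax1 and
            ay0 - gap <= by1 and by0 - gap <= ay1)


def candidate_collections(strokes, max_strokes=3, gap=10.0):
    """Enumerate stroke-index groups worth sending to a recognizer.

    Assumed restrictions (hypothetical, for illustration only):
    - a group has at most `max_strokes` strokes;
    - in a multi-stroke group, every stroke is near at least one other member.
    """
    result = []
    for size in range(1, max_strokes + 1):
        for idxs in combinations(range(len(strokes)), size):
            if size == 1 or all(
                any(near(strokes[i], strokes[j], gap) for j in idxs if j != i)
                for i in idxs
            ):
                result.append(idxs)
    return result
```

With two touching strokes and one far away, only the singles and the touching pair survive, so the recognizer sees four candidates instead of all seven non-empty subsets. The saving grows quickly as the number of strokes on the canvas increases.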

The author then describes recognition methods for basic shapes, such as rectangles and ellipses.

The experiment shows that users preferred Tahuti over the other systems it was compared with.
Discussion:
The idea of grouping is really good. It does not require users to draw an object in a specific manner. The author also finds ways to reduce the number of collections, which allows the program to run in real time. I see the same idea in the CivilSketch code.

However, the paper does not report any recognition accuracy, so we do not know whether the system really works well. There should be a test of its geometric method.

Reading #29: Scratch Input: Creating Large, Inexpensive, Unpowered and Mobile Finger Input Surfaces (Harrison)

Comments:
Chris
Summary:
The paper presents an acoustic-based finger input system, called Scratch Input, that can be used to create large, inexpensive, and mobile finger input surfaces. The system is easy to carry and can be used on desks, walls, mobile phones, and so on. Only one microphone is used to record the sounds.

The recognizer identifies gestures by their sounds. Peak count and amplitude are extracted from the sound of each gesture, and a shallow decision tree makes the decision. The system is tested on six gestures, and the accuracy is about 89%.
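A rough Python sketch of this kind of recognizer is below. It is an illustration under my own assumptions, not the authors' implementation: the thresholds, the gesture labels, and the hand-written rules standing in for the decision tree are all hypothetical, and the input is assumed to be an already-computed amplitude envelope rather than raw audio.

```python
def count_peaks(envelope, threshold):
    """Count how many times the envelope rises from below to at/above threshold."""
    peaks = 0
    above = False
    for value in envelope:
        if value >= threshold and not above:
            peaks += 1
            above = True
        elif value < threshold:
            above = False
    return peaks


def classify(envelope, threshold=0.3, loud=0.8):
    """Shallow, hand-written decision rules over two features.

    Features: number of peaks and maximum amplitude.
    Labels and cut-off values are hypothetical placeholders.
    """
    peaks = count_peaks(envelope, threshold)
    peak_amplitude = max(envelope, default=0.0)
    if peaks == 0:
        return "none"
    if peaks == 1:
        return "tap" if peak_amplitude >= loud else "single-stroke"
    if peaks == 2:
        return "double-stroke"
    return "multi-stroke"
```

For example, `classify([0.0, 0.5, 0.1, 0.6, 0.0])` sees two separate rises above the threshold and returns `"double-stroke"`. Two gestures with the same number of peaks and similar loudness would land in the same leaf, which is exactly the vocabulary limitation raised in the discussion.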
Discussion:
Recognizing gestures by sound is an interesting topic, and the authors give several example applications of their system. However, the system seems too trivial to me. First, the gestures are too different from one another: six gestures can easily be told apart by peak count alone because they have different numbers of strokes. Second, the vocabulary is limited, because too many gestures produce the same sound. Third, stroke order matters a great deal, and only the specified order can be recognized. Finally, the system needs a quiet environment. Though easy and convenient, the system really has many limitations.

Reading #28: iCanDraw? – Using Sketch Recognition and Corrective Feedback to Assist a User in Drawing Human Faces (Dixon)

Comments:
Wenzhe Li
Summary:
The paper presents a system, called iCanDraw, that teaches novice drawers how to draw a person's face. It provides directions and feedback to help users draw a face as accurately as possible. This is important because users cannot always find an instructor to teach and correct them.

The system starts with a reference image. The image is processed by a face recognition technique, and the result is refined by the authors to make the template as good as possible. Users look at the image, create reference lines, and then draw the face. At any time, they can check whether their drawing is consistent with the template and get corrective feedback.

The user studies show that the system performs well.
Discussion:
It is great work to teach novice drawers how to draw a face when an instructor is not available. What I appreciate most is the corrective feedback. In my opinion, feedback is the most important part of human-computer interaction. According to the user study, the system really does give good feedback that helps drawers produce a good image.

However, the system is limited to drawing faces, because it relies on mature face recognition techniques. It does not seem easy to extend it to other kinds of pictures. One idea: could users be allowed to draw the reference lines themselves? They would draw on the template, and the lines would be displayed in both the template and the drawing area.

Reading #27: K-Sketch: A 'Kinetic' Sketch Pad for Novice Animators (Davis)

Comments:
Sam
Summary:
The paper proposes a general-purpose, informal, 2D animation sketching system, called K-Sketch, to help novices create a wide range of animations quickly.

The work began with many user studies and interviews with animators and non-animators, which demonstrated the importance of designing an informal animation tool that takes little time to learn or use. Those users proposed 18 animation operations. The goal is for the system to be not only fast but also powerful, so the authors selected 9 operations that meet most users' requirements. The system is implemented in C#.

The system is evaluated in three small user studies. All of them indicated that K-Sketch is stronger than PowerPoint in many respects, except for comfort in sharing.
Discussion:
This is a good paper for helping novice researchers like me learn how to conduct research on a sketch-based human-computer interaction system. First, conduct a study to learn what people need. Second, design a system and make a trade-off between functionality and computational time. Finally, conduct user studies to compare its performance with other tools.

However, I see this paper more as a technical report than a conference paper. I think a conference paper should contain some new ideas, but this paper is more about implementation than ideas.

Saturday, December 11, 2010

Reading #26: Picturephone: A Game for Sketch Data Capture (Johnson)

Comments:
Francisco
Summary:
The paper proposes a sketch-based game, Picturephone, for collecting data on how people make and describe sketches. It is inspired by a children's game called Telephone. There are three modes: draw, describe, and rate. Each user is randomly assigned a mode when joining the game. In draw mode, users are asked to draw a sketch based on a description. In describe mode, users are asked to write a description based on a sketch. In rate mode, users are asked to give a score to each pair of sketches.


Picturephone is a web-based application that uses the standard HTTP protocol. Its main advantage is that it does not require all users to play the game synchronously.
Discussion:
This is a good application of hand-drawn sketches, and the idea of the game is really good. Compared to Stellasketch, I think the asynchronous game is better. It is really hard to get a lot of users to play a game at the same time unless the game is already as popular as chess or poker. To be honest, that is impossible for a game like this.

My bigger concern, as I noted in the discussion of Reading #24, is how to use these data. I think there is still a long way to go before a recognizer can use these sketches as training examples. How to filter out dirty data and how to remove ambiguity and conflicts are still major open questions for these games.

Reading #25: A Descriptor for Large Scale Image Retrieval Based on Sketched Feature Lines (Eitz)

Comments:
Chris
Summary:
The paper presents a tensor-based descriptor for large-scale image retrieval based on sketched feature lines. The descriptor is used to search a database for images that are similar to the input sketch. It addresses the asymmetry between the binary sketch input and the full-color images.



The proposed tensor descriptor captures the main orientation of the gradients in a cell. The descriptor is tested on a set of 1.5 million pictures of outdoor scenes. It performs comparably to or slightly better than a variant of the MPEG-7 edge histogram descriptor, and it is easy to implement and efficient to evaluate.
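As a rough illustration of what "main orientation of the gradients in a cell" can mean, the Python sketch below sums the outer products of the gradients in one cell into a 2x2 structure tensor and reads off its dominant direction. This is a generic structure-tensor computation, not the paper's exact descriptor: the central-difference gradients, the cell layout, and the function names are my assumptions, and any normalization or distance measure the paper uses is left out.

```python
import math


def cell_tensor(image, r0, c0, size):
    """Sum of gradient outer products over a size x size cell.

    `image` is a list of rows of floats. Gradients use central differences,
    and pixels on the image border are skipped (an assumed simplification).
    Returns the three distinct entries (jxx, jxy, jyy) of the symmetric tensor.
    """
    rows, cols = len(image), len(image[0])
    jxx = jxy = jyy = 0.0
    for r in range(r0, r0 + size):
        for c in range(c0, c0 + size):
            if 0 < r < rows - 1 and 0 < c < cols - 1:
                gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
                gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
                jxx += gx * gx
                jxy += gx * gy
                jyy += gy * gy
    return jxx, jxy, jyy


def main_orientation(jxx, jxy, jyy):
    """Angle (radians) of the tensor's dominant eigenvector, i.e. the main gradient direction."""
    return 0.5 * math.atan2(2.0 * jxy, jxx - jyy)
```

Because the tensor is built from products of gradient components, a gradient and its negation contribute the same amount. A dark line on a light background and a light line on a dark background therefore give the same orientation, which suits matching a binary sketch against color photographs.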

Discussion:
Searching a database for an image with an input sketch is a good idea, and sketch-based image retrieval is another direction within the sketch field. The descriptor proposed in the paper is simple to implement and performs better than one other descriptor. However, the paper does not compare the tensor descriptor with any others, so I cannot judge its overall performance. Also, in the experiments an input sketch always retrieves many candidate pictures, some of which seem unrelated to the input, so other descriptors should be added to make retrieval more effective. Finally, the descriptor has some limitations under transformation, which need improvement in the future.

Reading #24: Games for Sketch Data Collection (Johnson)

Comments:
Kim
Summary:
The paper presents two games for sketch data collection. One is an asynchronous game called Picturephone; the other is a synchronous game called Stellasketch. Both games are web-based and need many users to participate.

Picturephone collects long sentences that describe sketches. Each user is randomly assigned to one of three modes (draw, describe and rate). In draw mode, users draw a sketch based on the description. In describe mode, users describe the sketch. In rate mode, users give a score to each pair of sketches.

Stellasketch gathers short noun phrases that label sketches as they are made. In each round, one user draws a sketch based on given nouns, and the other users describe the sketch with noun phrases. No player knows what the other players have been asked to do.

Discussion:
These two games are used only to collect users' sketches. The ideas behind them are really good, because they capture data while people are being entertained. They are a good application of human-computer interaction.

However, as the author says, there are still difficulties in using these sketches. Some data are very noisy, some contain extra strokes, and some even consist of unrelated sketches. The quality of the data cannot be guaranteed. Using these data to train recognizers is also very hard.

I appreciate the idea of collecting data through games, but I don't think it has many advantages over conventional methods such as user studies.