Shumin Zhai

Born: April 1, 1961 (age 63), Harbin, China
Nationality: American, Canadian
Alma mater: University of Toronto
Known for: Keyboard technology invention (ShapeWriter), the steering law, computer input methods, "Active Edge" for Google Pixel
Awards: ACM Fellow, ACM CHI Academy, ACM User Interface and Software Technology Lasting Impact Award
Scientific career
Fields: Human–computer interaction, interaction methods, human performance modeling
Institutions: Google, IBM Almaden Research Center
Doctoral advisors: Paul Milgram, Bill Buxton
Website: shuminzhai.com

Shumin Zhai (simplified Chinese: 翟树民; born 1961) is a Chinese-born American and Canadian human–computer interaction (HCI) research scientist and inventor. He is known for his research on input devices and interaction methods, swipe-gesture-based touchscreen keyboards, eye-tracking interfaces, and models of human performance in human–computer interaction. His studies have contributed to foundational models and understanding of HCI as well as to practical user interface designs and flagship products. He previously worked at IBM, where he invented the ShapeWriter text entry method for smartphones, a predecessor to the modern Swype keyboard.[1][2] Dr. Zhai's publications have won the ACM UIST Lasting Impact Award and the IEEE Computer Society Best Paper Award, among others. He is currently a principal scientist at Google, where he leads research, design, and development of human-device input methods and haptics systems.

Education

Born in Harbin, China in 1961, Dr. Zhai received his bachelor's degree in Electrical Engineering in 1982 and his master's degree in Computer Science in 1984, both from Xidian University. He then served on the faculty of the Northwest Institute of Telecommunication Engineering (now Xidian University) in Xi'an, China, where he taught and conducted research in computer control systems until 1989. In 1995, he received his PhD degree in Human Factors Engineering at the University of Toronto.[3]

Career

From 2001 to 2007, Dr. Zhai was a visiting adjunct professor in the Department of Computer and Information Science (IDA) at Linköping University, where he also supervised graduate research.

He was a consultant at Autodesk in 1995 before joining IBM Almaden Research Center in 1996.

From 1996 to 2011, he worked at the IBM Almaden Research Center. There he originated and led the SHARK/ShapeWriter project at IBM Research and a start-up company that pioneered the touchscreen word-gesture keyboard paradigm, filing the first patents on this paradigm and publishing the first generation of scientific papers.[4] In 2010, ShapeWriter was acquired by Nuance Communications and taken off the market. During his tenure at IBM, Dr. Zhai also worked with a team of engineers from IBM and IBM vendors to bring the ScrollPoint mouse from research to market; the product received a CES award and reached millions of users.

From 2009 to 2015, Dr. Zhai was also the editor-in-chief of the ACM Transactions on Computer-Human Interaction. During this period he was deeply involved in both the conference and the journal sides of publishing HCI research, as an author, reviewer, editor, committee member, and papers chair.[5][6]

Since 2011, Dr. Zhai has worked at Google as a principal scientist, where he leads research, design, and development of human-device input methods and haptics systems. Specifically, he has led research and design of Google's keyboard products, Pixel phone haptics, and novel Google Assistant invocation methods. Notably, Dr. Zhai led the design of Active Edge, a headline feature of Google Pixel 2, which lets the user reach Google Assistant faster and more intuitively with a gentle squeeze of the device rather than through the touch screen.

Work

Dr. Zhai's research is primarily in human–computer interaction, and he currently works on the research, design, and development of manual and text input methods and haptics systems. Besides text input and haptics, his research interests include system user interface design, human-performance modeling, multi-modal interaction, computer input devices and methods, and theories of human–computer interaction.[7] He has published over 200 research papers[8] and received 30 patents.[9]

Word-gesture keyboard

In 2003, Dr. Zhai and Per Ola Kristensson proposed a method of speed-writing for pen-based computing, SHARK (shorthand aided rapid keyboarding), which augments stylus keyboarding with shorthand gesturing. SHARK defines a shorthand symbol for each word according to its movement pattern on an optimized stylus keyboard.[10] In 2004, they presented SHARK2, which increased recognition accuracy and relaxed precision requirements by using the shape and location of gestures in addition to context-based language models.[11] In doing so, Dr. Zhai and Kristensson delivered a paradigm of touch screen gesture typing[12] as an efficient method for text entry that has continued to drive the development of mobile text entry across the industry.[4] One of the most important rationales of gesture keyboards is facilitating the transition from primarily visual-guidance-driven letter-to-letter tracing to memory-recall-driven gesturing.[13] By releasing the first word-gesture keyboard in 2004 through IBM AlphaWorks and a top-ranked iPhone app called ShapeWriter WritingPad in 2008,[14] Dr. Zhai and his colleagues facilitated this transition and brought the invention from the laboratory to real-world users.[15]
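
The idea of defining a gesture template for each word from its movement pattern on the keyboard can be illustrated with a minimal Python sketch. It is not the SHARK or SHARK2 recognizer: it assumes a simplified QWERTY grid with unit key spacing (the original work used an optimized stylus keyboard), a hypothetical three-word lexicon, and a plain mean point-to-point distance as the match score, and it leaves out the language model entirely. The names KEYS, resample, template, gesture_distance, and recognize are illustrative only.

```python
import math

# Hypothetical simplified QWERTY layout: each key is a point on a unit grid,
# with each row shifted half a key to the right of the row above it.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEYS = {ch: (col + 0.5 * row, float(row))
        for row, letters in enumerate(ROWS)
        for col, ch in enumerate(letters)}


def resample(points, n=32):
    """Return n points spaced evenly by arc length along the polyline."""
    if len(points) == 1:
        return [points[0]] * n
    seg = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(seg)
    if total == 0:
        return [points[0]] * n
    out = []
    i, walked = 0, 0.0  # current segment index, length of earlier segments
    for k in range(n):
        target = total * k / (n - 1)
        while i < len(seg) - 1 and walked + seg[i] < target:
            walked += seg[i]
            i += 1
        t = min((target - walked) / seg[i], 1.0) if seg[i] > 0 else 0.0
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def template(word, n=32):
    """Ideal gesture for a word: the path through its key centers."""
    return resample([KEYS[c] for c in word], n)


def gesture_distance(a, b):
    """Mean distance between corresponding points of two resampled paths."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def recognize(trace, lexicon, n=32):
    """Return the lexicon word whose template lies closest to the trace."""
    sampled = resample(trace, n)
    return min(lexicon, key=lambda w: gesture_distance(sampled, template(w, n)))


if __name__ == "__main__":
    lexicon = ["the", "tie", "toe"]  # hypothetical toy vocabulary
    # A sloppy stroke starting near "t", bending near "h", ending near "e".
    stroke = [(4.2, 0.1), (5.3, 0.9), (2.1, 0.2)]
    print(recognize(stroke, lexicon))
```

Because the trace and the templates are compared in keyboard coordinates without any normalization, this sketch uses both the shape and the location of the stroke, and the stroke does not need to hit each key exactly to be matched to the nearest word.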

Laws and models of action

One of Dr. Zhai's main HCI research threads is human performance models of the Fitts’ law type. Since 1996, Dr. Zhai and his colleagues have pursued research on “Laws of Action” that attempts to carry the spirit of Fitts' law forward. In the HCI context, Fitts' law can be considered the “Law of Pointing”, and they argue that other, equally robust regularities of human performance in action exist. The two new classes of action relevant to user interface design and evaluation that they have explored are crossing and steering.[16]

  • “Law of Pointing”: Refining Fitts’ law models for bivariate pointing, 2003[17]
  • “Law of Steering”: Human Action Laws in Electronic Virtual Worlds - an empirical study of path steering performance in VR, 2004[18]
  • “Law of Crossing”: Foundations for designing and evaluating user interfaces based on the crossing paradigm, 2010[19]
  • Modeling human performance of pen stroke gestures, 2007[20]
  • FFitts' law: modeling finger touch with Fitts' law, 2013[21]
  • Modeling Gesture-Typing Movements, 2018[22]
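
For readers unfamiliar with these laws, the pointing and steering models have compact standard forms. The Python sketch below uses the Shannon formulation of Fitts' law, MT = a + b * log2(D/W + 1), and the straight-tunnel form of the steering law, T = a + b * (A/W). The regression constants a and b are hypothetical placeholders: in practice they are fitted to measured data for a given device and task, and the function names are illustrative only.

```python
import math


def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted pointing time in seconds (Shannon formulation of Fitts' law)."""
    index_of_difficulty = math.log2(distance / width + 1)  # in bits
    return a + b * index_of_difficulty


def steering_time(length, width, a=0.1, b=0.08):
    """Predicted time in seconds to steer through a straight, constant-width tunnel."""
    return a + b * (length / width)


if __name__ == "__main__":
    # Target 7 units away and 1 unit wide: index of difficulty = log2(8) = 3 bits.
    print(fitts_time(7, 1))
    # Tunnel 10 units long and 2 units wide: steering difficulty A/W = 5.
    print(steering_time(10, 2))
```

The contrast between the two functions shows the main difference between the laws: pointing time grows logarithmically with the distance-to-width ratio, while steering time grows linearly with the length-to-width ratio of the path.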

Manipulation and navigation in 3D interfaces

Dr. Zhai started working on multiple degrees of freedom (DOF) input during his graduate years at the University of Toronto. In his Ph.D. thesis, he systematically examined human performance as a function of design variations of a 6 DOF control device, such as control resistance (isometric, elastic, and isotonic), transfer function (position vs. rate control), muscle groups used, and display format. He investigated people's ability to coordinate multiple degrees of freedom, based on three ways of quantification: simultaneous time-on-target, error correlation, and efficiency.
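
The article does not give formulas for these coordination measures. As an illustration of the efficiency measure only, the Python sketch below assumes one common definition: the extra distance a trajectory travels beyond the straight-line path between its start and end points, relative to that straight-line path. This definition, and the function names path_length and inefficiency, are assumptions made for the example rather than details taken from the thesis.

```python
import math


def path_length(points):
    """Total length of the polyline through a sequence of N-dimensional points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def inefficiency(points):
    """Extra path travelled, as a fraction of the straight-line distance.

    0.0 means the movement followed the shortest path, i.e. all degrees of
    freedom were changed together in perfect proportion.
    """
    shortest = math.dist(points[0], points[-1])
    if shortest == 0:
        raise ValueError("start and end points coincide")
    return (path_length(points) - shortest) / shortest


if __name__ == "__main__":
    # Moving x first and then y (one degree of freedom at a time) is less
    # coordinated than moving along the diagonal.
    sequential = [(0, 0, 0), (1, 0, 0), (1, 1, 0)]
    diagonal = [(0, 0, 0), (1, 1, 0)]
    print(inefficiency(sequential), inefficiency(diagonal))
```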

Eye-tracking augmented user interfaces

Dr. Zhai has been involved in two applications of eye-tracking augmented user interfaces: MAGIC pointing and RealTourist.[23]

In 1999, he and his colleagues Carlos Morimoto and Steven Ihde at the IBM Almaden Research Center published the paper "Manual and gaze input cascaded (MAGIC) pointing". This work explored a new direction in using eye gaze for computer input, showing that MAGIC pointing techniques might offer several advantages: less physical effort and fatigue than traditional manual pointing, greater accuracy and naturalness than traditional gaze pointing, and possibly faster speed than manual pointing.[24]

In 2005, he developed and studied an experimental system, RealTourist, with Pernilla Qvarfordt and David Beymer. RealTourist lets a user plan a conference trip with the help of a remote tourist consultant who can view the tourist's eye-gaze superimposed onto a shared map. Data collected from the experiment were analyzed in conjunction with a literature review on speech and eye-gaze patterns. This exploratory research identified various functions of gaze overlay on shared spatial material, including accurate and direct display of the partner's eye-gaze, implicit deictic referencing, interest detection, common focus and topic switching, increased redundancy and ambiguity reduction, and an increase in assurance, confidence, and understanding. The study identified patterns that can serve as a basis for designing multimodal human-computer dialogue systems with eye-gaze locus as a contributing channel, and investigated how computer-mediated communication can be supported by displaying the partner's eye-gaze.[25]

FonePal

FonePal is a system developed to improve the experience of accessing call centers or help desks. Known as "touchtone hell", voice menu navigation has long been recognized as a frustrating user experience due to the nature of voice presentation. In contrast, FonePal allows a user to scan and select from a visual menu at the user's own pace, typically much faster than waiting for the voice menus to be spoken. FonePal uses the Internet infrastructure, specifically Instant Messaging, to deliver a visual menu on a nearby computer screen simultaneously with the voice menu over the phone.[26]

In 2005 and 2006, Dr. Zhai and his colleague Min Yin at the IBM Almaden Research Center published two papers about this project. Their studies show that FonePal enables easier navigation of IVR phone trees, higher navigation speed, fewer routing errors, and greater satisfaction. FonePal can also seamlessly bridge the caller to a searchable web knowledge base, promoting relevant self-help and reducing call center operating costs.[27][28]

Awards and honors

Dr. Zhai is a Fellow of the Association for Computing Machinery (ACM) and a member of the CHI Academy. His awards and honors include:

  • IEEE Computer Society Best Paper Award
  • One of ACM's inaugural class of Distinguished Scientists (2006)
  • Member of the CHI Academy (2010)
  • Fellow of the ACM (2010)[4]
  • ACM UIST Lasting Impact Award (2014)[4]

References

  1. ^ Zhai, Shumin; Kristensson, Per-Ola (2003). "Shorthand Writing on Stylus Keyboard". Proceedings of the conference on Human factors in computing systems - CHI '03. ACM. pp. 97–104. doi:10.1145/642611.642630. ISBN 1581136307. S2CID 1697605.
  2. ^ "Total recall boosts PDA writing". 15 August 2005. Retrieved 18 March 2019.
  3. ^ "1996-11-08 Zhai". hci.stanford.edu. Retrieved 2019-04-26.
  4. ^ a b c d "Googler Shumin Zhai awarded with the ACM UIST Lasting Impact Award". Google AI Blog. 3 November 2014. Retrieved 2019-04-27.
  5. ^ "ACM Transactions on Computer-Human Interaction". tochi.acm.org. Retrieved 2019-04-27.
  6. ^ Zhai, Shumin, ed. (December 2015). "TOCHI Editor-in-Chief Transition: Farewell from Shumin Zhai, Welcome Ken Hinckley". ACM Trans. Comput.-Hum. Interact. 22 (6): 27e:1–27e:5. doi:10.1145/2835174. ISSN 1073-0516.
  7. ^ Zhai, Shumin. "About Me". Shumin Zhai. Retrieved 27 April 2019.
  8. ^ "Shumin Zhai - Google Scholar Citations". scholar.google.com. Retrieved 2019-04-27.
  9. ^ "CBB Seminar: Dr. Shumin Zhai, Google Inc". Engineering, University of Waterloo. 2018-04-09. Retrieved 2019-04-27.
  10. ^ Zhai, Shumin; Kristensson, Per-Ola (2003). "Shorthand writing on stylus keyboard". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '03. New York, NY, USA: ACM. pp. 97–104. doi:10.1145/642611.642630. ISBN 9781581136302. S2CID 1697605.
  11. ^ Kristensson, Per Ola; Zhai, Shumin (2004). "SHARK2: a large vocabulary shorthand writing system for pen-based computers". Proceedings of the 17th annual ACM symposium on User interface software and technology. UIST '04. ACM. pp. 43–52. doi:10.1145/1029632.1029640. ISBN 9781581139570. S2CID 3970190 – via ACM Digital Library.
  12. ^ US 7251367, Zhai, Shumin, "System and method for recognizing word patterns based on a virtual keyboard layout", published 2007-07-31, assigned to IBM 
  13. ^ Zhai, Shumin; Kristensson, Per Ola (September 2012). "The word-gesture keyboard: reimagining keyboard interaction". Communications of the ACM. 55 (9). ACM: 91–101. doi:10.1145/2330667.2330689. S2CID 566903.
  14. ^ "WritingPad - Top iPhone Applications - Time". Time. December 21, 2008.
  15. ^ Zhai, Shumin (2009). "Shapewriter on the iphone: From the laboratory to the real world". CHI '09 Extended Abstracts on Human Factors in Computing Systems. CHI EA '09. ACM. pp. 2667–2670. doi:10.1145/1520340.1520380. ISBN 9781605582474. S2CID 12477412 – via ACM Digital Library.
  16. ^ "shuminzhai | Research Projects". Shumin Zhai. 25 February 2018. Retrieved 2019-04-27.
  17. ^ Accot, Johnny; Zhai, Shumin (2003). "Refining Fitts' law models for bivariate pointing". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '03. New York, NY, USA: ACM. pp. 193–200. doi:10.1145/642611.642646. ISBN 9781581136302. S2CID 5154061.
  18. ^ Zhai, Shumin; Accot, Johnny; Woltjer, Rogier (2004-04-01). "Human Action Laws in Electronic Virtual Worlds: An Empirical Study of Path Steering Performance in VR". Presence: Teleoperators and Virtual Environments. 13 (2): 113–127. doi:10.1162/1054746041382393. ISSN 1054-7460. S2CID 36408015.
  19. ^ Apitz, Georg; Guimbretière, François; Zhai, Shumin (May 2010). "Foundations for Designing and Evaluating User Interfaces Based on the Crossing Paradigm". ACM Trans. Comput.-Hum. Interact. 17 (2): 9:1–9:42. doi:10.1145/1746259.1746263. ISSN 1073-0516. S2CID 6224916.
  20. ^ Cao, Xiang; Zhai, Shumin (2007). "Modeling human performance of pen stroke gestures". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '07. New York, NY, USA: ACM. pp. 1495–1504. doi:10.1145/1240624.1240850. ISBN 9781595935939. S2CID 6745302.
  21. ^ Bi, Xiaojun; Li, Yang; Zhai, Shumin (2013). "FFitts law". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '13. New York, NY, USA: ACM. pp. 1363–1372. doi:10.1145/2470654.2466180. ISBN 9781450318990. S2CID 2675893.
  22. ^ Quinn, Philip; Zhai, Shumin (2018-05-04). "Modeling Gesture-Typing Movements". Human–Computer Interaction. 33 (3): 234–280. doi:10.1080/07370024.2016.1215922. ISSN 0737-0024. S2CID 4571827.
  23. ^ Shumin Zhai. "What's in the Eyes For Attentive Input | March 2003 | Communications of the ACM". cacm.acm.org. Retrieved 2019-04-27.
  24. ^ Zhai, Shumin; Morimoto, Carlos; Ihde, Steven (1999). "Manual and gaze input cascaded (MAGIC) pointing". Proceedings of the SIGCHI conference on Human factors in computing systems the CHI is the limit - CHI '99. New York, NY, USA: ACM. pp. 246–253. doi:10.1145/302979.303053. ISBN 9780201485592. S2CID 207247711.
  25. ^ Qvarfordt, Pernilla; Beymer, David; Zhai, Shumin (2005). "RealTourist – A Study of Augmenting Human-Human and Human-Computer Dialogue with Eye-Gaze Overlay". In Costabile, Maria Francesca; Paternò, Fabio (eds.). Human-Computer Interaction - INTERACT 2005. Lecture Notes in Computer Science. Vol. 3585. Springer Berlin Heidelberg. pp. 767–780. doi:10.1007/11555261_61. ISBN 9783540317227.
  26. ^ "shuminzhai | Research Projects". Shumin Zhai. 25 February 2018. Retrieved 2019-04-27.
  27. ^ Yin, Min; Zhai, Shumin (2005). "Dial and see". Proceedings of the 18th annual ACM symposium on User interface software and technology. UIST '05. New York, NY, USA: ACM. pp. 187–190. doi:10.1145/1095034.1095066. ISBN 9781595932716. S2CID 8403712.
  28. ^ Yin, Min; Zhai, Shumin (2006). "The benefits of augmenting telephone voice menu navigation with visual browsing and search". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '06. New York, NY, USA: ACM. pp. 319–328. doi:10.1145/1124772.1124821. ISBN 9781595933720. S2CID 16484512.