Mobile Phone Usability and Cultural Dimensions: China, Germany & USA

Tobias Komischke, April McGee, Ning Wang, Klaus Wissmann
Siemens AG¹, Beijing, China; Munich, Germany; Princeton, NJ, USA
[email protected]; [email protected]; [email protected]; [email protected]

Abstract

It is well known that cultural differences influence the way in which people think and behave in many areas of work and home. This paper describes an exploratory study comparing mobile phone usability in China, Germany and the USA. The test design includes a newly developed questionnaire that, for the first time, links perceived product attributes to cultural dimensions. Conclusions focus not only on comparing ease of use and user satisfaction between countries, but also on explaining these issues against the background of cultural factors.

Keywords: mobile phone usability, cultural dimensions, internationalization

1. Background

Since the beginning of the 1990s, it has become essential for companies that are globally active and want to market their products internationally to address intercultural considerations in their product development processes. Empirical research comparing countries along specific cultural dimensions, on which intercultural usability engineering builds, originated in the field of international business. In order to support foreign negotiations or the posting of staff to foreign countries, broad quantitative research was conducted through questionnaires (Hofstede, 1991; Trompenaars, 1993). Because intercultural dimensions were not explored primarily for the purpose of supporting product development, one major step was to establish the connection to graphical user interface design. This was addressed in particular by Marcus (2001), who linked the Hofstedian cultural dimensions with characteristic factors in user interfaces (e.g. navigation, interaction, presentation).
This link, originally conceived for web design, was operationalized more explicitly by Röse (2002) and defined as cultural user interface factors. Yet because relatively few publications describe which issues matter when products are adapted to different markets, much insight can still be gained from applied research. In this project we focused on comparing the use of a product in different countries and on mapping product characteristics such as usability and perceived impression to cultural dimensions. As a result, we hoped to reach an understanding regarding:

• what differences and what similarities of usage can be found in the addressed countries
• the nature of links/relationships between products and cultures
• what conclusions can be drawn for the localization of a product

¹ We wish to thank Phil Arko, Nuray Aykin, Stefan Mark, Alex Sanchez and Kathleen Stahlberg for their assistance.

2. Method

2.1. Participants

The participants in this study were 15 working professionals from China, Germany and the USA, five from each country. Participants were matched as closely as possible to the target audience defined for the mobile phone:

• People that have a high interest in design issues
• 55% female, 45% male
• Between 28 and 40 years old
• Early career success
• Willing to show their success

Each country’s group consisted of three females and two males, for a total of nine females and six males. Their ages ranged from 27 to 43 years. Participants were screened to qualify as early adopters open to new technologies. Each participant received a monetary incentive as a thank-you token.

2.2. Measures

Participants were asked to complete eight tasks using the mobile phone. The language of the prompts and text feedback on the phone matched the country where the test took place. Tasks were representative of typical mobile phone operations:

1. Briefly explore the phone and tell me your first impressions
2. Place a call to the following number: <phone number specified>
3. Call your friend <name specified>, whose number is in the address book
4. You have a new business contact and want to save his name and office phone number in the address book: <name + phone number specified>
5. Write your friend <name + phone number specified> a short message <message specified>
6. Set a new calendar appointment.
7. Disable the vibration mode for the phone’s ringer.
8. Set the alarm for 8 p.m. today.

With the exception of the first task, the task order was randomized for each participant. The numbers of errors and help requests were recorded. Test sessions lasted between 45 and 120 minutes. All sessions were videotaped.

The 5 Usability Dimensions Attitude Scale (“5UD”) (ISO 9241, 1993) was administered after each task and at the end of the entire session to track and measure the attributes of efficiency, likeability, self-descriptiveness, controllability and learnability. Each was rated on a scale from 1 to 9, where 1 is the least favorable and 9 the most favorable rating.

To achieve our goals we adopted the Hofstedian cultural factors, which describe a culture by its position on each of the following dimensions:

• Power Distance: The extent to which the less powerful members of institutions and organizations of a country expect and accept that power is distributed unequally.
• Individualism vs. Collectivism: Individualism reflects cultures in which the ties between individuals are loose. Collectivism implies that people are integrated into strong cohesive “in-groups” from birth onwards.
• Masculinity vs. Femininity: Masculinity characterizes societies in which stark differences are drawn between male and female roles. In feminine cultures, male and female roles are less sharply distinguished.
• Uncertainty Avoidance: The extent to which people use formalities, punctuality and legal, religious and social requirements to avoid uncertainty and ambiguity.
• Long-term vs. short-term orientation: Long-term orientation is found in cultures that orient themselves towards future reward. Perseverance and thrift are valued. Cultures with short-term orientation promote the virtues of the past and the present.

Hofstede reported values which normally range from 0 to 100 for each dimension. A higher number indicates a stronger characteristic of the dimension within the respective country. Since Hofstede has no data available for China for the dimensions of Power Distance, Individualism, Masculinity and Uncertainty Avoidance, these scores were estimated using the average scores of the Asian cultures of Japan, Taiwan, South Korea and Hong Kong.

To be able to link users’ perceived impression of the phone to cultural factors, a semantic differential scale was developed. For this, we used findings from Hassenzahl (2002) on the perceived hedonic quality of products as well as from Marcus’ work mentioned above. The scale features 15 bipolar word pairs, three for each of Hofstede’s five cultural dimensions. The understandability of the word pairs in all three languages (US English, German and Mandarin Chinese) was validated by pre-tests. Since this scale marks only a first step towards attributing product impression to cultural factors, it has not yet been statistically analyzed for its internal validity.

The cultural dimension data derived from the semantic differential (median) were normalized to fit the scale of Hofstede’s scores. Then the difference between the scores from this study and the Hofstede scores (study score minus Hofstede score) was calculated for each country and dimension. For the relationship between the findings from this semantic differential scale and Hofstede’s findings, we formulated the following two hypotheses:

(1) For Germany, we expected to find approximately the same scores as Hofstede reported.
(2) For China and the US, we expected that the scores from the semantic differential scale would deviate from the Hofstedian scores, leaning towards the scores which Hofstede reported for Germany.

The reason for both hypotheses is that we assume a phone developed in Germany would reflect a “German” interaction philosophy.

2.3. Procedure

The test design was planned and generated in cooperation between all three countries. Some local adaptation had to be done, e.g. in defining the screening criteria for selecting test persons. Due to the different time zones, this coordination had to be performed asynchronously. The resulting test design was then carried out in the same way in each country.

Participants were given a brief introduction to the session, then completed a demographic questionnaire. The researcher then handed the participant the phone and an index card with the first task. Participants were encouraged to think and talk out loud. After each task, participants completed a 5UD questionnaire for that task. If the participant became stuck or frustrated on a task or asked for help, the researcher provided assistance. If the participant gave up completely, the researcher proceeded to the next task. The researcher took notes using a standard paper protocol and videotaped each session. After the last task, the 5UD questionnaire was administered for an overall usability rating, and a debriefing session obtained general impressions as well as likes and dislikes.

The data from the tests were analyzed and prepared first locally in each country and then merged in the US into a central results document. Further communication between the three countries was necessary to clarify some issues and fine-tune the results documentation.

3. Results

3.1. Usability findings

3.1.1. Variations in Perception of Usability

Task 4 shows the first instance of variation between countries in the perception of usability.
This task, which asked participants to save an entry to the phonebook, showed that Chinese participants found the phone to be unforgiving of user errors and not self-descriptive when they relied on the user interface for help. Task 5, writing a short message, further emphasizes this divergence. In this case, US participants gave negative ratings to controllability (‘3’) and learnability (‘4’); Chinese participants gave a negative rating to controllability (‘4’); German participants continued to rate all five usability dimensions favorably: efficiency (‘7’), likeability (‘7’), self-descriptiveness (‘8’), controllability (‘7’) and learnability (‘7’).

In task 6, setting a calendar appointment, the countries converge again: participants in all three countries had difficulties setting up an appointment. In fact, all three countries rated the phone neutral or worse, with two exceptions: the Germans gave a likeability rating of ‘6’ and the Chinese gave a learnability rating of ‘6’. Chinese participants rated the phone’s level of self-descriptiveness with a ‘2’, which is unique, as there were no other ratings lower than ‘4’ by any country for any of the other attitude measures.

Task 7, disabling the vibration mode, yielded consistent ratings within each country across all five usability dimensions: Chinese participants rated between ‘7’ and ‘8’, German participants between ‘6’ and ‘7’, and US participants a solid ‘5’ throughout. Task 8, setting the alarm, yielded generally positive results (a ‘5’ or better), except for controllability, which the Chinese participants rated ‘4’.

3.1.2. Request for Assistance in Task Completion

On more complex tasks users did request assistance.
During task 4, only one US participant needed help, and only once; three Chinese participants needed assistance at least once, but no more than three times, in order to complete this task; and three German participants needed help, with no more than one question each. Task 5 gave two Chinese participants some difficulty, with between two and three questions asked. Several German participants needed help only once, and one US participant needed help once. Task 6 was the task where German participants requested assistance the most (three participants with at least one question, but no more than two), while Chinese participants had fewer questions but more usability problems. This is not an unusual finding, given that in Chinese culture people are more reluctant to acknowledge a lack of competency than in Western cultures.

3.1.3. Distribution of errors

Across all test tasks, the German users made the fewest errors while the Chinese users made the most. Because the phone was developed in Germany, the conclusion from an intercultural standpoint could be that the phone’s interaction philosophy complies with the expectations of the German test users. One reason why Chinese users made more errors than the German and US test persons on all but one task could have been that the Mandarin Chinese translation of several terms was not optimal.

Across all three countries, the fewest errors were made for the tasks of calling someone and changing settings (tasks 2, 7 and 8). Tasks that involved entering alphanumeric text produced increased error rates. One part of the problem was that most American and Chinese test users were unfamiliar with the T9 technology, which automatically completes a word and therefore may display characters other than those the user actually selected. Because the phone’s default setting had T9 enabled, the American and Chinese users produced more errors.
Since they could not use the phone long enough to learn about the advantages of T9, they felt a lack of controllability and therefore disapproved of the technology.

3.1.4. Results on impression

The comparison of the results from the three countries shows that, for most word pairs from the semantic differential, participants followed the same answer tendencies. Major deviations were found for the following items. US and German test users perceived the phone as being geared towards experts, whereas the Chinese thought of it as being geared towards novices. The same pattern was observed for the word pair “ambiguous” versus “definite.” One interpretation could be that because US and German participants perceived the phone as being ambiguous, they also rated it as being geared towards experts. The only word pair on which all three countries clearly had different opinions was (a) “requires perseverance to reach results” versus (b) “allows to achieve quick results.” While the Chinese test users indicated a neutral position on this issue, the German test users tended towards (a). The US test participants, on the other hand, tended towards (b). The following section further analyzes the results from the semantic differential against the background of intercultural issues.

3.2. Cultural Dimensions

Table 1: Cultural Dimension Scores

              China                                   Germany                                USA
Dimension¹    Hofstede²  Study findings  Difference   Hofstede  Study findings  Difference   Hofstede  Study findings  Difference
PD            60         71              11           35        57              22           40        71              31
IDV           27         71              44           67        71              4            91        86              -5
MAS           59         29              -30          66        43              -23          62        57              -5
UAI           69         57              -12          65        57              -8           46        43              -3
LTO           100        57              -43          31        43              12           29        43              14

¹ PD = Power Distance, IDV = Individualism, MAS = Masculinity, UAI = Uncertainty Avoidance, LTO = Long-term Orientation. A higher number (on a scale from 0 to 100) indicates a stronger characteristic of the dimension within the respective country.
² Estimated scores based upon the average of Hofstede’s scores for Japan, Taiwan, South Korea and Hong Kong.

[Figure 1: bar chart of the differences for China, Germany and the USA on PD, IDV, MAS, UAI and LTO; vertical axis from -50 to 50]

Figure 1: Differences between Hofstede’s scores and the scores found in this study.

Figure 1 shows the differences between the Hofstede scores and the scores from this study. Positive values mean that our scores were higher than Hofstede’s, whereas negative values mean that our scores were lower than Hofstede’s.

3.2.1. Differences Within Countries

With the exception of Long-term Orientation, all Hofstede scores for China had to be estimated. The scores found in this study clearly deviate from these Hofstede scores. China shows the two largest differences of all three countries. The Individualism score is much higher than that found by Hofstede. This finding is in line with our second hypothesis, that deviating scores lean towards the German score reported by Hofstede. Since Hofstede reports a considerably higher Individualism score for Germany, this result can be interpreted as reflecting a German cultural influence on the impression of the phone. The same line of argument holds for Long-term Orientation. The score that was found is much lower than Hofstede’s score, indicating that the phone was perceived as being un-Asian, or geared more towards quick insight and immediate success. The Masculinity score was also much lower than Hofstede’s score. Since Germany has a higher Hofstede score on this dimension than China, this result cannot be explained in the same way as for Individualism and Long-term Orientation and therefore does not support our second hypothesis. However, the phone was geared more towards women than men, which the Masculinity score seems to reflect.
We therefore assume that this end-customer focus overlays the effects that our second hypothesis predicted for the cultural dimension of Masculinity.

For Germany, we expected to find approximately the same scores as Hofstede (hypothesis one), because the product was developed in Germany and should therefore reflect German cultural influence. As the results show, this hypothesis cannot be confirmed. Only Individualism and Uncertainty Avoidance deviated by less than 10 score points from Hofstede’s findings. Power Distance was rated clearly higher, meaning that the German test users felt the phone behaved more formally than they would have expected. Long-term Orientation also scored higher, which suggests that the phone was perceived, more than expected, as being geared towards persons who would be interested in using and understanding it in detail without much time pressure. Similar to the findings for China, the Masculinity score was lower than expected. The same rationale can be used here: the phone was geared towards women.

Overall, the US scores deviated the least from Hofstede’s scores. Very small differences were found for Individualism, Masculinity and Uncertainty Avoidance. Consequently, the phone meets the US test persons’ expectations on these dimensions. The focus on women does not seem to have had a notable effect. Power Distance scored considerably higher than reported by Hofstede. Because Hofstede reports a lower Power Distance score for Germany than for the USA, our second hypothesis could not be confirmed for this dimension. However, the finding is in line with the scores given by the German test users, who also perceived the phone as being formal and hard to approach. Long-term Orientation is higher than the Hofstede score. This follows our second hypothesis, even though Germany’s score as reported by Hofstede is only slightly higher than the US score.
This finding also matches the German test users’ scores, leading to the same interpretation: the phone was perceived, more than expected, as being geared towards persons who would be interested in using and understanding it in detail without much time pressure.

3.2.2. Differences Between Countries

It is interesting to note cultural similarities in the findings between the countries. As mentioned above, Germany and the US show similar scores for Long-term Orientation. That there is hardly any difference between the two countries corresponds with Hofstede’s findings and reflects the fact that both countries are typical examples of Western culture.

Germany and China show similar scores on the Uncertainty Avoidance dimension. Hofstede also reports nearly equal values for these countries on this dimension, albeit at a higher level. In contrast to the US culture, the findings confirm that people in Germany and China have a lower tolerance for unstructured situations and deviant ideas or behaviors.

China and Germany show similar scores for Individualism, while the Hofstede scores are quite different (Germany being much higher). As mentioned above, the Chinese scores seem to have adapted to the German scores.

China and the USA have similar scores on the cultural dimension Power Distance, yet the Hofstede scores are different for these two countries. The test persons from both countries agree that the cultural influence shaped the phone towards being formal and hard to approach. This perception deviates more from the Hofstede score for the USA than from the one for China.

4. Conclusion and outlook

Due to the small number of participants the results should not be over-generalized; however, they do show similarities and differences between the countries. Especially interesting is that all modules that were used to assess user feedback on the product showed culture-related results.
This will allow us to analyze cultural similarities and differences in more depth than could be achieved in this study.

Finding differences in usage, impression and rating of a product confirms not only earlier research findings, but also the strategy of global companies that act on this insight by maintaining local usability efforts. We also experienced that such decentralized work requires a higher communication and synchronization effort. However, because this study was to a large extent detached from the development process, these issues were not a significant hindrance.

The newly developed semantic differential scale provided some interesting results. It proved to be of value for assessing general impressions of the phone. Independently of that, it could additionally be used as a tool that allows an intercultural interpretation of the responses. To advance the scale from its exploratory nature towards a scientific tool, its internal validity needs to be addressed in further research steps.

Due to time constraints, much of the study data could not be analyzed. This analysis will be carried out together with applying the methodology and tools to other product domains.

5. References

Hassenzahl, M. (2002). The effect of perceived hedonic quality on product appealingness. International Journal of Human-Computer Interaction, 13, pp. 479-497.

Hofstede, G. (1981). Culture's consequences: International differences in work-related values. Newbury Park, CA: Sage.

Hofstede, G. (1991). Cultures and Organizations: Software of the Mind: Intercultural Cooperation and its Importance for Survival. New York: McGraw-Hill.

ISO 9241-11 (1993). Ergonomic requirements for office work with visual display terminals (VDTs) - Guidance on usability. Berlin: Beuth.

Marcus, A. (2001). Cultural Dimensions and Global Web Design. AM+A. http://www.amanda.com/resources/hfweb2000/AMA_CultDim.pdf

Röse, K. (2002). Models of culture and their applicability for designing user interfaces. In: Luczak, H.; Cakir, A.: WWDU 2002, Work With Display Units, World Wide Work. Proceedings of the 6th International Scientific Conference on Work with Display Units, Berchtesgaden, May 22-25, 2002, pp. 319-321.

Trompenaars, F. (1993). Riding the waves of culture: understanding cultural diversity in business. London: The Economist Press.
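
Note on the score computation. The handling of scores described in Section 2.2 and summarized in Table 1 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure the authors actually ran. The scores for Japan, Taiwan, South Korea and Hong Kong are commonly cited Hofstede values that are not listed in this paper. The paper does not state the rescaling formula either; the sketch assumes a 7-point semantic differential rescaled linearly, an assumption suggested only by the fact that all study findings in Table 1 (29, 43, 57, 71, 86) are rounded multiples of 100/7. All function and variable names are hypothetical.

```python
import math

# ASSUMPTION: commonly cited Hofstede country scores. They are not given in
# the paper and serve only to illustrate the averaging step for China.
ASIAN_REFERENCE = {
    "Japan":       {"PD": 54, "IDV": 46, "MAS": 95, "UAI": 92},
    "Taiwan":      {"PD": 58, "IDV": 17, "MAS": 45, "UAI": 69},
    "South Korea": {"PD": 60, "IDV": 18, "MAS": 39, "UAI": 85},
    "Hong Kong":   {"PD": 68, "IDV": 25, "MAS": 57, "UAI": 29},
}

# Values copied from Table 1 (China's PD, IDV, MAS and UAI are estimates).
HOFSTEDE = {
    "China":   {"PD": 60, "IDV": 27, "MAS": 59, "UAI": 69, "LTO": 100},
    "Germany": {"PD": 35, "IDV": 67, "MAS": 66, "UAI": 65, "LTO": 31},
    "USA":     {"PD": 40, "IDV": 91, "MAS": 62, "UAI": 46, "LTO": 29},
}
STUDY = {
    "China":   {"PD": 71, "IDV": 71, "MAS": 29, "UAI": 57, "LTO": 57},
    "Germany": {"PD": 57, "IDV": 71, "MAS": 43, "UAI": 57, "LTO": 43},
    "USA":     {"PD": 71, "IDV": 86, "MAS": 57, "UAI": 43, "LTO": 43},
}


def round_half_up(value):
    """Round to the nearest integer, with .5 rounded up (unlike round())."""
    return math.floor(value + 0.5)


def estimate_from_reference(reference, dimensions=("PD", "IDV", "MAS", "UAI")):
    """Average each dimension over the reference countries."""
    return {
        dim: round_half_up(sum(c[dim] for c in reference.values()) / len(reference))
        for dim in dimensions
    }


def normalize_median(median_rating, scale_max=7):
    """ASSUMPTION: linear rescaling of a 1..scale_max median to 0..100."""
    return round_half_up(median_rating / scale_max * 100)


def differences(study, hofstede):
    """Study score minus Hofstede score, per country and dimension."""
    return {
        country: {dim: study[country][dim] - hofstede[country][dim]
                  for dim in study[country]}
        for country in study
    }


if __name__ == "__main__":
    print(estimate_from_reference(ASIAN_REFERENCE))
    print([normalize_median(m) for m in range(1, 8)])
    for country, diff in differences(STUDY, HOFSTEDE).items():
        print(country, diff)
```

Under these assumptions, the averaging step reproduces the estimated China scores in Table 1, and the subtraction reproduces the Difference columns. Rounding half up is used deliberately, because Python's built-in round() rounds halves to the nearest even integer and would turn the Individualism average of 26.5 into 26 rather than 27.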