Mobile Phone Usability and Cultural Dimensions: China, Germany & USA
Tobias Komischke, April McGee, Ning Wang, Klaus Wissmann
Siemens AG – Beijing, China; Munich, Germany; Princeton, NJ, USA
Abstract
It is well known that cultural differences influence the way in which people think and behave
in many areas of work and home. This paper describes an exploratory study comparing mobile
phone usability in China, Germany and the USA. The test design includes a newly developed
questionnaire that, for the first time, links perceived product attributes to cultural dimensions.
Conclusions focus not only on comparing ease-of-use and user satisfaction between countries,
but also on explaining these issues against the background of cultural factors.
Keywords: mobile phone usability, cultural dimensions, internationalization
1. Background
Since the beginning of the 1990s, it has become essential for companies who are globally
active and want to market their products internationally to address intercultural considerations
in their product development processes.
Empirical research comparing countries along specific cultural dimensions, on which
intercultural usability engineering builds, originated in the field of international
business. To support negotiations abroad or the posting of staff to foreign countries,
broad quantitative research was conducted by means of questionnaires
(Hofstede, 1991; Trompenaars, 1993). Because intercultural dimensions were not explored
primarily for the purpose of supporting product development, one major step was to establish
the connection to graphical user interface design. This was especially addressed by Marcus
(2001) who linked the Hofstedian cultural dimensions with characteristic factors in user
interfaces (e.g. navigation, interaction, presentation). This link, originally conceived for web
design, was operationalized more explicitly by Röse (2002) and defined as cultural user
interface factors.
Yet since there are still relatively few publications available that describe what issues are
important when products are to be adapted to different markets, much insight can be gained
by applied research. In this project we focused on comparing the use of a product in different
countries and on mapping product characteristics such as usability and perceived impression
to cultural dimensions. As a result, we hoped to reach an understanding regarding:
• what differences and what similarities of usage can be found in the addressed countries
• the nature of links/relationships between products and cultures
• what conclusions can be drawn for the localization of a product
We wish to thank Phil Arko, Nuray Aykin, Stefan Mark, Alex Sanchez and Kathleen Stahlberg for their assistance.
2. Method
2.1. Participants
The participants in this study were 15 working professionals, from China, Germany and the
USA, five from each country. Participants were matched as closely as possible to the target
audience defined for the mobile phone:
• People that have a high interest in design issues
• 55% female, 45% male
• Between 28 and 40 years old
• Early career success
• Willing to show their success
Each country’s group consisted of three females and two males for a total of nine females and
six males. Their ages ranged from 27 to 43 years. Participants were screened to qualify as
early adopters open to new technologies. Each participant received a monetary incentive as a
token of thanks.
2.2. Measures
Participants were asked to complete eight tasks using the mobile phone. The language of the
prompts and text feedback on the phone matched the country where the test took place. Tasks
were representative of typical mobile phone operations:
1. Briefly explore the phone and tell me your first impressions
2. Place a call to the following number: <phone number specified>
3. Call your friend <name specified>, whose number is in the address book
4. You have a new business contact and want to save his name and office phone number
in the address book: <name + phone number specified>
5. Write your friend <name + phone number specified> a short message <message specified>
6. Set a new calendar appointment.
7. Disable the vibration mode for the phone’s ringer.
8. Set the alarm for 8 p.m. today.
With the exception of the first task, the task order was randomized for each participant. The
number of errors and help requests was measured. Test sessions lasted between 45 and 120
minutes. All sessions were videotaped.
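The randomization described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the task labels are shortened, and the per-participant seed is a hypothetical device to make each order reproducible.

```python
import random

# Short labels for the eight tasks listed above.
TASKS = [
    "explore the phone",                      # task 1, always presented first
    "place a call",
    "call a contact from the address book",
    "save a new contact",
    "write a short message",
    "set a calendar appointment",
    "disable vibration mode",
    "set the alarm",
]


def task_order(seed):
    """Return the task sequence for one participant."""
    rng = random.Random(seed)   # hypothetical per-participant seed
    remaining = TASKS[1:]       # copy of tasks 2 to 8
    rng.shuffle(remaining)      # randomize their order in place
    return [TASKS[0]] + remaining


for participant in range(1, 4):
    print(participant, task_order(participant))
```

Keeping the exploration task fixed in first position means that every participant forms a first impression before any task-specific experience influences it.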
The 5 Usability Dimensions Attitude Scale (“5UD”) (ISO 9241, 1993) was administered after
each task and at the end of the entire session to track and measure the attributes of efficiency,
likeability, self-descriptiveness, controllability and learnability. Each attribute was rated on a
scale from 1 to 9, where 1 is the least favorable and 9 the most favorable rating.
To achieve our goals we adopted the Hofstedian cultural factors, which are described along
the following five dimensions:
• Power Distance: The extent to which the less powerful members of institutions or
organizations of a country expect and accept that power is distributed unequally.
• Individualism vs. Collectivism: Individualism reflects cultures in which the ties between
individuals are loose. Collectivism implies that people are integrated into strong cohesive
“in-groups” from birth onwards.
• Masculinity vs. Femininity: Masculinity defines societies where stark differences are
drawn between male and female roles in society. In feminine cultures, there is greater
ambiguity in the male and female roles.
• Uncertainty Avoidance: The extent to which people use formalities, punctuality and
legal, religious and social requirements to avoid uncertainty and ambiguity.
• Long-term vs. short-term orientation: Long-term orientation is found in cultures that
orient themselves towards future reward. Perseverance and thrift are valued. Cultures with
short-term orientation promote the virtues of the past and the present.
Hofstede reported values which normally range from 0 to 100 for each dimension. A higher
number correlates with a stronger characteristic of the dimension within the respective
country. Since Hofstede has no data available for China for the dimensions of Power
Distance, Individualism, Masculinity and Uncertainty Avoidance, these scores were estimated
using the average scores of the Asian cultures of Japan, Taiwan, South Korea and Hong Kong.
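The estimation for China can be illustrated with a short sketch. The per-country index values below are an assumption: they are the values commonly published for Hofstede’s survey and are not listed in this paper. Rounding halves upwards is also an assumption. Under these assumptions the averages reproduce the estimated Hofstede scores for China shown in Table 1.

```python
import math

# Assumption: index values as commonly published for Hofstede's survey.
# They do not appear in the paper itself.
ASIAN_SCORES = {
    "Japan":       {"PD": 54, "IDV": 46, "MAS": 95, "UAI": 92},
    "Taiwan":      {"PD": 58, "IDV": 17, "MAS": 45, "UAI": 69},
    "South Korea": {"PD": 60, "IDV": 18, "MAS": 39, "UAI": 85},
    "Hong Kong":   {"PD": 68, "IDV": 25, "MAS": 57, "UAI": 29},
}

DIMENSIONS = ["PD", "IDV", "MAS", "UAI"]


def round_half_up(value):
    """Round to the nearest integer, with .5 going up (assumed rounding rule)."""
    return math.floor(value + 0.5)


def estimate_scores(scores_by_country):
    """Average each dimension over the given countries."""
    count = len(scores_by_country)
    return {
        dim: round_half_up(sum(c[dim] for c in scores_by_country.values()) / count)
        for dim in DIMENSIONS
    }


print(estimate_scores(ASIAN_SCORES))
```

Long-term Orientation is not part of the sketch because a Hofstede score for China was available for that dimension.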
To be able to link users’ perceived impression of the phone to cultural factors, a semantic
differential scale was developed. For this, we used findings from Hassenzahl (2002) on the
perceived hedonic quality of products as well as Marcus’ work mentioned above. The
scale features 15 bipolar word pairs, three for each of Hofstede’s five cultural dimensions.
The understandability of the word pairs in all three languages (US English, German and
Mandarin Chinese) was validated in pre-tests. Since this scale marks only a first step
towards attributing product impression to cultural factors, it has not yet been
statistically analyzed for its internal validity.
The cultural dimension data derived from the semantic differential (medians) were normalized
to the range of Hofstede’s scores. The difference between the score from this study and the
Hofstede score (study score minus Hofstede score) was then calculated for each country and
dimension.
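A minimal sketch of this calculation follows. The paper does not state the number of scale points or the normalization formula, so both are assumptions here: a 7-point scale and a linear mapping of the median to 0 to 100. The assumption is at least consistent with Table 1, where every study score (29, 43, 57, 71, 86) equals a whole number of sevenths of 100. The ratings in the example are hypothetical.

```python
from statistics import median

SCALE_POINTS = 7  # assumption: the paper does not state the scale length


def normalize(ratings):
    """Map the median rating linearly onto the 0-100 range (assumed formula)."""
    return round(median(ratings) / SCALE_POINTS * 100)


def difference(study_score, hofstede_score):
    """Positive values mean the study score is higher than Hofstede's."""
    return study_score - hofstede_score


# Hypothetical ratings for one country and one dimension.
ratings = [5, 4, 6, 5, 5]
study = normalize(ratings)
print(study, difference(study, 60))
```

With a median of 5, the sketch yields a study score of 71 and, against a Hofstede score of 60, a difference of 11, which corresponds to the Power Distance entry for China in Table 1.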
For the relationship between the findings from this semantic differential scale and Hofstede’s
findings, we formulated the following two hypotheses:
(1) For Germany, we expected to find approximately the same scores as Hofstede
reported.
(2) For China and the US, we expected that the scores from the semantic differential scale
would deviate from the Hofstedian scores, leaning towards the scores which Hofstede
reported for Germany.
The reason for both hypotheses is that we assume a phone developed in Germany would
reflect a “German” interaction philosophy.
2.3. Procedure
The test design was planned and generated in cooperation between all three countries. Some
local adaptation had to be done, e.g. with the definition of the screening criteria to select test
persons. Due to the different time zones, this had to be performed asynchronously. The
resulting test design then was carried out similarly across the countries.
Participants were given a brief introduction to the session, then completed a demographic
questionnaire. The researcher then handed the subject the phone and an index card with the
first task. Participants were encouraged to think and talk out loud. After each task, participants
completed a 5UD questionnaire for that task. If the participant became stuck or frustrated on a
task or asked for help, the researcher provided assistance. If the participant gave up
completely, the researcher proceeded to the next task. The researcher took notes using a
standard paper protocol and videotaped each session. After the last task, the 5UD
questionnaire for an overall usability rating was administered, and a debriefing session
obtained general impressions as well as likes and dislikes.
The test data were first analyzed and prepared locally in each country and later
merged in the US into a central results document. Further communication
between the three countries was necessary to clarify some issues and fine-tune the results
documentation.
3. Results
3.1. Usability findings
3.1.1. Variations in Perception of Usability
Task 4 shows the first variation in perceived usability. In this task, which asked
participants to save an entry to the phonebook, Chinese participants found the phone
unforgiving of user errors and not self-descriptive when they relied on the user
interface for help.
Task 5, writing a short message, further emphasizes this divergence. Here, US
participants gave negative ratings for controllability (‘3’) and learnability (‘4’);
Chinese participants gave a negative rating for controllability (‘4’); German participants
continued to rate all five usability dimensions favorably: efficiency (‘7’),
likeability (‘7’), self-descriptiveness (‘8’), controllability (‘7’) and learnability (‘7’).
In task 6, setting a calendar appointment, the countries converge again: participants from
all three countries had difficulties setting up an appointment. All three countries rated
the phone neutral or worse, with two exceptions: the Germans gave a likeability rating of ‘6’
and the Chinese gave a learnability rating of ‘6’. Chinese participants rated the phone’s
self-descriptiveness with a ‘2’, which is unique, as there were no other ratings lower than
‘4’ by any country for any of the other attitude measures.
Task 7, disabling the vibration mode, yielded consistent ratings within each country across
all five usability dimensions: Chinese participants gave ratings between ‘7’ and ‘8’, German
participants between ‘6’ and ‘7’, and US participants a solid ‘5’ throughout. Task 8, setting
the alarm, generally yielded positive results (a ‘5’ or better), except for controllability,
which the Chinese participants rated ‘4’.
3.1.2. Request for Assistance in Task Completion
On more complex tasks users did request assistance. During task 4 only 1 US participant
needed help, and only once; 3 Chinese participants needed assistance at least once, but no
more than three times, in order to complete this task; while the Germans had 3 participants
that needed help, but with no more than one question each.
Task 5 gave 2 Chinese participants some difficulty; each asked two or three questions.
Several German participants needed help only once, and 1 US participant needed help once.
Task 6 was the task where German participants requested assistance the most (3 participants
with at least one question, but no more than two), while Chinese participants had fewer
questions, but more usability problems. This is not an unusual finding given that in Chinese
culture people are more reluctant to acknowledge a lack of competency than in Western
cultures.
3.1.3. Distribution of errors
Across all test tasks, the German users made the fewest errors and the Chinese users
the most. Because the phone was developed in Germany, the conclusion from an
intercultural standpoint could be that the phone’s interaction philosophy complies with the
expectations of the German test users. One reason why Chinese users made more errors
than the German and US participants on all but one task could be that the Mandarin
Chinese translation of several terms was not optimal.
Across all three countries, the fewest errors were made in the tasks of calling someone and
changing settings (tasks 2, 7 and 8). Tasks that involved entering alphanumeric text produced
higher error rates. Part of the problem was that most American and Chinese test users
were unfamiliar with the T9 technology, which automatically completes a word and may
therefore display characters other than those the user actually selected. Because T9 was
enabled in the phone’s default setting, the American and Chinese users produced more errors.
Since they could not use the phone long enough to learn the advantages of T9, they felt a
lack of controllability and therefore disapproved of the technology.
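Why T9 can display characters other than those the user expects is illustrated by the following sketch of keypad word disambiguation. It is not the implementation used on the tested phone; the word list is hypothetical, and only the standard letter assignment of a 12-key phone keypad is assumed.

```python
# Standard letter assignment on a 12-key phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {
    letter: key for key, letters in KEYPAD.items() for letter in letters
}

# Hypothetical dictionary; a real phone ships a much larger one.
WORDS = ["good", "home", "gone", "hood", "call", "phone"]


def keys_for(word):
    """Return the key sequence that types the given word, one press per letter."""
    return "".join(LETTER_TO_KEY[char] for char in word.lower())


def candidates(digits, words=WORDS):
    """Return all dictionary words that match the pressed keys."""
    return [word for word in words if keys_for(word) == digits]


print(candidates("4663"))
```

The key sequence 4663 matches four words in this small list, so a user who wants to type "home" may first see "good" on the display. A user unfamiliar with the technology can easily read this as the phone ignoring the input.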
3.1.4. Results on impression
A comparison of the results from the three countries shows that the answers followed the
same tendencies for most word pairs of the semantic differential. Major deviations were
found for the following word pairs. US and German test users perceived the phone as being
geared towards experts, whereas the Chinese thought of it as being geared towards novices.
The same pattern was observed for the word pair “ambiguous” versus “definite.” One
interpretation could be that because US and German participants perceived the phone as
ambiguous, they also rated it as geared towards experts. The only word pair on which all
three countries clearly differed was (a) “requires perseverance to reach results” versus
(b) “allows to achieve quick results.” While the Chinese test users took a neutral position
on this issue, the German participants tended towards (a) and the US participants
towards (b).
The following section analyzes the results from the semantic differential further against
the background of intercultural issues.
3.2. Cultural Dimensions
Table 1: Cultural Dimension Scores

            China                                  Germany                                USA
Dimension¹  Hofstede²  Study findings  Difference  Hofstede   Study findings  Difference  Hofstede   Study findings  Difference
PD          60         71              11          35         57              22          40         71              31
IDV         27         71              44          67         71              4           91         86              -5
MAS         59         29              -30         66         43              -23         62         57              -5
UAI         69         57              -12         65         57              -8          46         43              -3
LTO         100        57              -43         31         43              12          29         43              14

¹ PD = Power Distance, IDV = Individualism, MAS = Masculinity, UAI = Uncertainty Avoidance, LTO = Long-term Orientation. A higher number (on a scale from 0 to 100) correlates with a stronger characteristic of the dimension within the respective country.
² Estimated scores based upon the average of Hofstede’s scores of Japan, Taiwan, South Korea and Hong Kong
Figure 1: Differences between Hofstede’s scores and the scores found in this study (bar chart; series: Difference China, Difference Germany, Difference USA; dimensions PD, IDV, MAS, UAI, LTO; vertical axis from -50 to 50).
Figure 1 shows the differences between the Hofstede scores and the scores from this study.
Positive scores mean that our scores were higher than Hofstede’s whereas negative scores
mean that our scores were lower than Hofstede’s scores.
3.2.1. Differences Within Countries
With the exception of Long-term Orientation, all of Hofstede’s cultural dimension scores for
China had to be estimated. The scores found in this study clearly deviate from Hofstede’s
scores. China shows the two largest differences of all three countries. The Individualism
score is much higher than that found by Hofstede. This finding is consistent with our second
hypothesis that deviating scores lean towards the German scores stated by Hofstede. Since
Hofstede states a significantly higher Individualism score for Germany, this result can be
interpreted as reflecting a German cultural influence on the impression of the phone. The
same line of argument holds for Long-term Orientation. The score found is much lower than
Hofstede’s score, indicating that the phone was perceived as un-Asian, or geared more
towards quick insight and immediate success. The Masculinity score was also much lower than
Hofstede’s score. Since Germany has a higher Hofstede score on this dimension than China,
this result cannot be explained in the same way as for Individualism and Long-term
Orientation and therefore does not support our second hypothesis. However, the phone was
geared more towards women than men, which the Masculinity score seems to reflect. We
therefore assume that this end-customer focus overrides the effects that we predicted for
the Masculinity dimension according to our second hypothesis.
For Germany, we expected to find approximately the same scores as Hofstede (hypothesis 1),
because the product was developed in Germany and should therefore reflect German cultural
influence. As the results show, this hypothesis cannot be confirmed. Only Individualism and
Uncertainty Avoidance deviated by less than 10 score points from Hofstede’s findings. Power
Distance was rated clearly higher, meaning that the German test users felt the phone behaved
more formally than they would have expected. Long-term Orientation also scored higher, which
could lead to the conclusion that the phone was perceived, more strongly than expected, as
geared towards people who would be interested in using and understanding it in detail
without much time pressure. Similar to the findings for China, the Masculinity score was
lower than expected. The same rationale can be applied here: the phone was geared towards
women.
Overall, the US scores deviated the least from Hofstede’s scores. Very small differences
were found for Individualism, Masculinity and Uncertainty Avoidance. Consequently, the phone
meets the US participants’ expectations on these dimensions. The focus on women does not
seem to have a significant effect. Power Distance scored significantly higher than reported
by Hofstede. Because Hofstede states a lower Power Distance score for Germany, our second
hypothesis could not be confirmed for this dimension. However, the finding is in line with
the scores given by the German test users, who also perceived the phone as formal and hard
to approach. Long-term Orientation is higher than the Hofstede score. This follows our
second hypothesis, even though Germany’s score as reported by Hofstede is only slightly
higher than the US score. This finding also matches the German test users’ scores, leading
to the same interpretation: the culture inherent in the phone was perceived, more strongly
than expected, as geared towards people who would be interested in using and understanding
it in detail without much time pressure.
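The reasoning in this section can be retraced with a short sketch based on the values in Table 1. The 10-point tolerance for hypothesis 1 is taken from the text above. The sign test for hypothesis 2 is our reading of "leaning towards" the German Hofstede score and should be treated as an assumption rather than as the analysis performed in the study.

```python
# Scores from Table 1 (Hofstede values for China are partly estimated).
HOFSTEDE = {
    "China":   {"PD": 60, "IDV": 27, "MAS": 59, "UAI": 69, "LTO": 100},
    "Germany": {"PD": 35, "IDV": 67, "MAS": 66, "UAI": 65, "LTO": 31},
    "USA":     {"PD": 40, "IDV": 91, "MAS": 62, "UAI": 46, "LTO": 29},
}
STUDY = {
    "China":   {"PD": 71, "IDV": 71, "MAS": 29, "UAI": 57, "LTO": 57},
    "Germany": {"PD": 57, "IDV": 71, "MAS": 43, "UAI": 57, "LTO": 43},
    "USA":     {"PD": 71, "IDV": 86, "MAS": 57, "UAI": 43, "LTO": 43},
}
DIMENSIONS = ["PD", "IDV", "MAS", "UAI", "LTO"]


def difference(country, dim):
    """Study score minus Hofstede score, as in Table 1."""
    return STUDY[country][dim] - HOFSTEDE[country][dim]


def supports_h1(dim, tolerance=10):
    """Hypothesis 1: the German study score stays close to Hofstede's score."""
    return abs(difference("Germany", dim)) < tolerance


def supports_h2(country, dim):
    """Hypothesis 2 (assumed sign test): the shift points towards Germany."""
    shift = difference(country, dim)
    towards_germany = HOFSTEDE["Germany"][dim] - HOFSTEDE[country][dim]
    return shift * towards_germany > 0


print([dim for dim in DIMENSIONS if supports_h1(dim)])
for country in ("China", "USA"):
    print(country, {dim: supports_h2(country, dim) for dim in DIMENSIONS})
```

For the dimensions discussed above the sketch agrees with the text: Individualism and Long-term Orientation support hypothesis 2 for China, Masculinity does not, and for the US Long-term Orientation supports it while Power Distance does not.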
3.2.2. Differences Between Countries
It is interesting to note cultural similarities in the findings between the countries. As
mentioned above, Germany and the US show similar scores for Long-term Orientation. That
there is hardly any difference between the two countries corresponds with Hofstede’s findings
and reflects the fact that both countries are typical examples of Western culture.
Germany and China show similar scores on the Uncertainty Avoidance dimension. Hofstede also
reports nearly identical values for these countries on this dimension, although at a higher
level. In contrast to the US, the findings confirm that people in Germany and China have a
lower tolerance for unstructured situations and deviant ideas or behaviors.
China and Germany show similar scores for Individualism, while the Hofstede scores are
quite different (Germany being much higher). As mentioned above, the Chinese scores seem
to have shifted towards the German scores. China and the USA have similar scores on the
Power Distance dimension, yet the Hofstede scores differ for these two countries. The
test persons from both countries agree that the cultural influence shaped the phone towards
being formal and hard to approach. This result deviates more from Hofstede’s score for the
US than from the score estimated for China.
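The country pairs named in this section can be recovered from the study scores in Table 1 with a short sketch. Treating "similar" as "identical normalized score" is an assumption made for illustration; the `max_gap` parameter is a hypothetical knob for loosening it.

```python
from itertools import combinations

# Study findings from Table 1.
STUDY = {
    "China":   {"PD": 71, "IDV": 71, "MAS": 29, "UAI": 57, "LTO": 57},
    "Germany": {"PD": 57, "IDV": 71, "MAS": 43, "UAI": 57, "LTO": 43},
    "USA":     {"PD": 71, "IDV": 86, "MAS": 57, "UAI": 43, "LTO": 43},
}
DIMENSIONS = ["PD", "IDV", "MAS", "UAI", "LTO"]


def similar_pairs(scores, max_gap=0):
    """List (dimension, country, country) where the scores differ by at most max_gap."""
    pairs = []
    for dim in DIMENSIONS:
        for first, second in combinations(scores, 2):
            if abs(scores[first][dim] - scores[second][dim]) <= max_gap:
                pairs.append((dim, first, second))
    return pairs


for entry in similar_pairs(STUDY):
    print(entry)
```

With the default setting the sketch returns exactly the four pairings discussed above, and no pairing for Masculinity, where all three countries differ.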
4. Conclusion and outlook
Due to the small number of participants the results should not be over-generalized; however,
they do show similarities and differences between the countries. Especially interesting is that
all modules that were used to assess user feedback on the product showed culture-related
results. This will allow us to analyze cultural similarities and differences more in depth than
could be achieved in this study.
Finding differences in the usage, impression and rating of a product confirms not only
earlier research findings but also the strategy of global companies to invest in local
usability efforts. We also experienced that this decentralized work requires more
communication and synchronization effort. However, because this study was to a large extent
detached from the development process, these issues were not a significant hindrance.
The newly developed semantic differential scale provided some interesting results. It proved
valuable for assessing general impressions of the phone. Independently of that, it could
also be used as a tool that allows an intercultural interpretation of the responses. To
advance the scale from its exploratory nature towards a scientific tool, its internal
validity needs to be addressed in future research.
Due to time constraints much of the study data could not be analyzed. This will be carried out
together with applying the methodology and tools to other product domains.
5. References
Hassenzahl, M. (2002). The effect of perceived hedonic quality on product appealingness.
International Journal of Human-Computer Interaction, 13, pp. 479-497.
Hofstede, G. (1981). Culture's consequences: International differences in work-related values.
Newbury Park, CA: Sage.
Hofstede, G. (1991). Cultures and Organizations: Software of the Mind: Intercultural
Cooperation and its Importance for Survival. New York: McGraw-Hill.
ISO 9241-11 (Issue 1993). Ergonomic requirements for office work with visual display
terminals (VDTs) - Guidance on usability. Berlin: Beuth.
Marcus, A. (2001). Cultural Dimensions and Global Web Design. AM+A.
http://www.amanda.com/resources/hfweb2000/AMA_CultDim.pdf
Röse, K. (2002). Models of culture and their applicability for designing user interfaces. In:
Luczak, H.; Cakir, A.: WWDU 2002, Work With Display Units, World Wide Work.
Proceedings of the 6th International Scientific Conference on Work with Display Units,
Berchtesgaden, May 22-25, 2002, pp. 319-321.
Trompenaars, F. (1993). Riding the waves of culture: understanding cultural diversity in
business. London: The Economist Press.