Nuno Diegues and Paolo Romano and Luís Rodrigues

To appear at the 23rd Conference on Parallel Architectures and Compilation Techniques (PACT 2014)
On the Energy and Performance of Commodity Hardware Transactional Memory
Transactional Memory (TM)
• multi-core processors are standard
• locking approaches are complex
• traditional, pessimistic approach
• requires per-application effort
• TM is an abstraction for synchronization
• programmers identify atomic blocks
• runtime implements the synchronization

atomic {
    withdraw(acc1,val)
    deposit(acc2,val)
}

Now also in hardware: x86 extensions in Intel Core processors with Transactional Synchronization Extensions (TSX)
synchronization in hardware is efficient but limited, not suited for all workloads

best-effort nature:
• many spurious aborts
• requires fallback path

transactions may abort due to:
• forbidden instructions
• capacity of L1 cache
• faults and signals
• (besides contention to data)
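
The best-effort nature described above means every hardware transaction needs a software fallback path. The following is a minimal single-threaded sketch of that control flow in Python, not the authors' implementation. The names `run_atomic`, `simulated_htm`, `fallback_lock`, and `max_retries` are hypothetical; a real implementation would issue hardware transactions (for example through the `_xbegin`/`_xend` RTM intrinsics in C) instead of calling the simulated attempt used here.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance

# Hypothetical single global lock serving as the fallback path.
fallback_lock = threading.Lock()

def run_atomic(body, try_hardware, max_retries=3):
    """Run `body` atomically: best-effort attempts first, then the fallback.

    `try_hardware` is a stand-in for one hardware transaction attempt. It
    returns True if `body` committed and False if the attempt aborted.
    """
    for _ in range(max_retries):
        if try_hardware(body):
            return "htm"
    # Best-effort HTM gives no progress guarantee, so fall back to the lock.
    with fallback_lock:
        body()
    return "fallback"

def simulated_htm(abort_first_n):
    """Build a fake hardware attempt that aborts its first `abort_first_n` calls."""
    state = {"calls": 0}
    def attempt(body):
        state["calls"] += 1
        # Abort spuriously, or because another thread holds the fallback lock.
        if state["calls"] <= abort_first_n or fallback_lock.locked():
            return False
        body()  # toy model: the body simply runs to completion
        return True
    return attempt

def transfer(src, dst, val, try_hardware):
    def body():
        src.balance -= val  # withdraw(acc1, val)
        dst.balance += val  # deposit(acc2, val)
    return run_atomic(body, try_hardware)

acc1, acc2 = Account(100), Account(0)
path_a = transfer(acc1, acc2, 30, simulated_htm(abort_first_n=1))   # aborts once, then commits
path_b = transfer(acc1, acc2, 20, simulated_htm(abort_first_n=10))  # exhausts its retries
```

In this sketch the first transfer commits on its second attempt, while the second transfer aborts on every attempt and completes under the lock. The retry budget is an assumed tuning knob, not a value taken from the poster.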
Research question: how does commodity HTM fare against existing available alternatives?
Software TMs (4 impls)
• instrumented reads and writes
• software runtime validates transactions
easy prototyping, robust implementations
overhead of software concurrency control

Hybrid TMs (2 impls)
• use STM on fallback of HTM
• long, conflicting txs use STM
best of both worlds?
complex integration of strategies

Locks (6 impls)
• traditional, pessimistic approach
• requires per-application effort
can be highly optimised
complex and not composable
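
To make "instrumented reads and writes" and "software runtime validates transactions" concrete, here is a toy software TM in Python. It is a sketch under simplifying assumptions (one global commit lock, per-variable version counters, buffered writes), and it does not correspond to any of the STMs evaluated on the poster. All names (`TVar`, `Tx`, `Conflict`, `atomically`) are hypothetical.

```python
import threading

class TVar:
    """A transactional variable: a value plus a version bumped on each commit."""
    def __init__(self, value):
        self.value = value
        self.version = 0

class Conflict(Exception):
    """Raised when commit-time validation finds a stale read."""

_commit_lock = threading.Lock()  # serializes commits only, not whole transactions

class Tx:
    def __init__(self):
        self.reads = {}   # TVar -> version observed at first read
        self.writes = {}  # TVar -> buffered new value

    def read(self, tvar):
        # Instrumented read: log the version so commit can validate it.
        if tvar in self.writes:
            return self.writes[tvar]
        self.reads.setdefault(tvar, tvar.version)
        return tvar.value

    def write(self, tvar, value):
        # Instrumented write: buffer the value until commit.
        self.writes[tvar] = value

    def commit(self):
        with _commit_lock:
            # Validation: every variable read must be unchanged since the read.
            for tvar, seen in self.reads.items():
                if tvar.version != seen:
                    raise Conflict()
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1

def atomically(body, max_retries=10):
    """Run body(tx) and retry it whenever validation fails."""
    for _ in range(max_retries):
        tx = Tx()
        try:
            result = body(tx)
            tx.commit()
            return result
        except Conflict:
            continue
    raise RuntimeError("too many conflicts")

acc1, acc2 = TVar(100), TVar(0)

def transfer(tx, val=30):
    tx.write(acc1, tx.read(acc1) - val)  # withdraw(acc1, val)
    tx.write(acc2, tx.read(acc2) + val)  # deposit(acc2, val)

atomically(transfer)

# A transaction that read acc1 before a concurrent commit fails validation.
stale = Tx()
stale.read(acc1)
atomically(transfer)  # commits in between and bumps acc1.version
try:
    stale.write(acc1, 0)
    stale.commit()
    stale_committed = True
except Conflict:
    stale_committed = False
```

The read log, write buffer, and validation pass are the kind of bookkeeping the poster summarizes as the "overhead of software concurrency control". This toy validates only at commit time, so a running transaction may observe an inconsistent snapshot before it aborts; production STMs typically add safeguards against that.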
Summary of results
560 scenarios using TM standard benchmarks
• STM is still competitive
• despite commodity hardware
• best STM was better than HTM in 71% of the scenarios
• Hybrid TMs inherit worst of both worlds
• never the best approach in any scenario
HTM is a good fit for concurrent data structures
[Figure: Hash Map, 10% writes. % of txs aborted (0 to 80) vs. #threads (1, 4, 8), for TSX and Hybrid TM]

Research Directions
Tuning HTM
optimal configuration up to 80% better than best configuration on average
also to appear at ICAC’14
Selective Instrumentation
20% impact on STM and Hybrid approaches
[Figure: Red-Black Tree, 90% writes. Speedup vs. #threads for Hybrid TMs, manual and compiler series]
This work was supported by PEst-OE/EEI/LA0021/2013 from Fundação para a Ciência e Tecnologia in Portugal and by the GreenTM project (EXPL/EEI-ESS/0361/2013).
