Nuno Diegues and Paolo Romano and Luís Rodrigues
To appear at the 23rd Conference on Parallel Architectures and Compilation Techniques (PACT 2014)

On the Energy and Performance of Commodity Hardware Transactional Memory

Transactional Memory (TM)
• multi-core processors are standard
• locking approaches are complex
• TM is an abstraction for synchronization
• programmers identify atomic blocks
• runtime implements the synchronization

    atomic {
        withdraw(acc1, val)
        deposit(acc2, val)
    }

Now also in hardware: x86 extensions in Intel Core processors with Transactional Synchronization Extensions (TSX)
Synchronization in hardware is efficient but limited, and not suited for all workloads.
Transactions may abort due to:
• forbidden instructions
• capacity of L1 cache
• faults and signals
• (besides contention on data)
Best-effort nature:
• many spurious aborts
• requires fallback path

Research question: how does commodity HTM fare against the existing alternatives?

Software TMs (6 impls)
• instrumented reads and writes
• software runtime validates transactions
Pro: easy prototyping, robust implementations
Con: overhead of software concurrency control

Hybrid TMs (4 impls)
• use STM on fallback of HTM
• long, conflicting txs use STM
Pro: best of both worlds?
Con: complex integration of strategies

Locks (2 impls)
• traditional, pessimistic approach
• requires per-application effort
Pro: can be highly optimised
Con: complex and not composable

Summary of results
560 scenarios using standard TM benchmarks
• STM is still competitive
    • despite commodity hardware
    • best STM was better than HTM in 71% of the scenarios
• Hybrid TMs inherit the worst of both worlds
    • never the best approach in any scenario
• HTM is a good fit for concurrent data structures

[Figure: Hash Map, 10% writes. % of txs aborted (0 to 80) vs. #threads (1, 4, 8), for TSX and HYBRID TM]

Research Directions

Tuning HTM
• optimal configuration up to 80% better than the best configuration on average
• also to appear at ICAC'14

Selective Instrumentation
• 20% impact on STM and Hybrid approaches

[Figure: Hybrid TMs, Red-Black Tree, 90% writes. Speedup vs. #threads, manual vs. compiler instrumentation]

This work was supported by PEst-OE/EEI/LA0021/2013 from Fundação para a Ciência e Tecnologia in Portugal and by the GreenTM project (EXPL/EEI-ESS/0361/2013).
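
The poster notes that commodity HTM is best-effort: transactions may abort spuriously, so every atomic block requires a fallback path. A minimal sketch of that retry-then-fallback pattern, applied to the withdraw/deposit block, might look as follows. It is a pure-software simulation and does not use TSX: `run_atomic`, `make_flaky_tx`, `transfer`, `Account`, and the retry budget of 3 are all hypothetical names and values chosen for illustration, not taken from the poster or from any library.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance


# Single global lock used as the fallback path (an assumption of this sketch).
fallback_lock = threading.Lock()


def run_atomic(body, try_tx, max_retries=3):
    """Run body() atomically: best-effort attempts first, then the fallback.

    try_tx(body) is a hypothetical stand-in for one hardware transaction
    attempt. It returns True if body ran and committed, False if it aborted.
    Returns which path succeeded: "htm" or "fallback".
    """
    for _ in range(max_retries):
        if try_tx(body):
            return "htm"
    # Retry budget exhausted: take the pessimistic path under the lock.
    # A real HTM fast path would also read this lock inside the transaction,
    # so that it aborts when another thread holds it; that is omitted here.
    with fallback_lock:
        body()
    return "fallback"


def make_flaky_tx(aborts_before_commit):
    """Build a simulated attempt that aborts a fixed number of times."""
    remaining = [aborts_before_commit]

    def try_tx(body):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False  # simulated abort: body did not run
        body()
        return True

    return try_tx


def transfer(acc1, acc2, val, try_tx, max_retries=3):
    def body():
        acc1.balance -= val  # withdraw(acc1, val)
        acc2.balance += val  # deposit(acc2, val)

    return run_atomic(body, try_tx, max_retries)
```

With a simulated attempt that aborts once, the transfer completes on the second attempt; with one that aborts five times, the budget of three attempts runs out and the transfer completes under the lock. How many times to retry before falling back is one of the knobs the "Tuning HTM" direction refers to.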