VELOX publication receives Best Paper Award at the 11th IEEE International Conference on High Performance Computing and Communications (HPCC-09)

25/06/2009

Sutirtha Sanyal, whose work was partially funded by the VELOX Project, won the Best Paper Award for his submission "Dynamically Filtering Thread-Local Variables in Lazy-Lazy Hardware Transactional Memory". His paper was one of only 57 accepted out of more than 240 submissions to the 11th IEEE International Conference on HPCC 09. The Best Paper Award honors the author(s) of papers of exceptional merit in the areas of High Performance Computing and Communications and is based on general quality, originality, contributions, subject matter and timeliness.

Sutirtha Sanyal received a Master's degree in VLSI Design Tools and Technology from the Indian Institute of Technology in Delhi, India, and a Bachelor's degree in Electronics and Tele-Communication from Jadavpur University, Kolkata, India. He has 5 years of industrial experience on design teams at companies such as Intel, Synopsys and LG. He has spent the past 2 years pursuing his research interests in the architectural aspects of processor/ASIC design.

Abstract: Transactional Memory (TM) is an emerging technology which promises to make parallel programming easier. However, to be efficient, the underlying TM system should protect only truly shared data and leave thread-local data out of the transaction. This speeds up the commit phase of the transaction, which is a bottleneck for a lazily versioned HTM.

In this work we propose a scheme, in the context of a lazy-lazy (lazy conflict detection and lazy data versioning) Hardware Transactional Memory (HTM) system, to dynamically identify variables which are local to a thread and exclude them from the commit set of the transaction. Our proposal covers sharing of both stack and heap data, and filters out local accesses to both. We also propose, in the same scheme, to identify local variables for which versioning need not be maintained.

For evaluation, we have implemented a lazy-lazy model of HTM in line with the conventional and the scalable versions of TCC in a full-system simulator. For the operating system, we have modified the Linux kernel. We obtained an average speed-up of 31% for the conventional TCC on applications from the STAMP benchmark suite. For the scalable TCC we obtained an average speed-up of 16%. We also found that, on average, 99% of the local variables can be safely omitted when recording old values to handle aborts.
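
As a rough illustration of the idea described in the abstract, the following Python sketch shows how a transaction's commit set shrinks when thread-local addresses are filtered out. It is not the mechanism from the paper: the names `SharingTracker`, `Transaction` and `commit_set`, and the rule "an address is local until a second thread touches it", are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


class SharingTracker:
    """Hypothetical tracker: an address counts as thread-local
    until a second thread accesses it (an assumption, not the paper's scheme)."""

    def __init__(self):
        self.owner = {}      # address -> first thread that touched it
        self.shared = set()  # addresses touched by more than one thread

    def record_access(self, thread_id, addr):
        first = self.owner.setdefault(addr, thread_id)
        if first != thread_id:
            self.shared.add(addr)

    def is_local(self, thread_id, addr):
        return addr not in self.shared and self.owner.get(addr) == thread_id


@dataclass
class Transaction:
    thread_id: int
    # Lazy versioning: new values are buffered until commit.
    write_buffer: dict = field(default_factory=dict)

    def write(self, tracker, addr, value):
        tracker.record_access(self.thread_id, addr)
        self.write_buffer[addr] = value


def commit_set(tx, tracker):
    """Addresses that still need to take part in the commit:
    everything written except what is currently thread-local."""
    return {a for a in tx.write_buffer if not tracker.is_local(tx.thread_id, a)}


tracker = SharingTracker()
tracker.record_access(2, 0x20)   # thread 2 has touched 0x20, so it is shared

tx = Transaction(thread_id=1)
tx.write(tracker, 0x10, "a")     # only thread 1 touches 0x10: local
tx.write(tracker, 0x20, "b")     # also touched by thread 2: shared

filtered = commit_set(tx, tracker)
```

Here `filtered` contains only the shared address `0x20`; the write to `0x10` stays out of the commit phase, which is the effect the abstract attributes to filtering thread-local variables.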

More information: http://www.velox-project.eu/sites/default/files/hpcc.pdf