Sunday, February 16th:

6:00pm - 8:00pm Welcome Reception (Plaza G)

Breakfast will be provided from 8:00am to 9:30am at Plaza G each day from Feb. 17th through Feb. 19th.

Monday, February 17th:

Time Session A Session B
8:30am - 10:00am Opening and Keynote I: Mark Hill Keynotes (Florida B+C)
10:00am - 10:20am Break
10:20am - 12:00pm Session 1A: Caches (Plaza E) Session 1B: Reliability and Process Variation (Plaza F)
12:00pm - 1:30pm Lunch (Plaza G)
1:30pm - 2:45pm Session 2A: Race Detection and Instruction Monitoring (Plaza E) Session 2B: Datacenters (Plaza F)
2:45pm - 3:15pm Break
3:15pm - 4:30pm Session 3A: Coherence and Consistency (Plaza E) Session 3B: Best of CAL (Plaza F)
4:30pm - 4:45pm Break
4:45pm - 6:25pm Session 4A: Security and Cloning (Plaza E) Session 4B: GPUs (Plaza F)
6:30pm - 8:00pm TCCA Business Meeting (Plaza E)

Tuesday, February 18th:

Time Session A Session B
8:05am - 9:45am Session 5A: Interconnection Networks (Plaza E) Session 5B: DRAM (Plaza F)
9:45am - 10:05am Break
10:05am - 11:45am Session 6: Best Paper I (Plaza E+F)
11:45am - 1:15pm Lunch (Plaza G)
1:15pm - 2:30pm Session 7: Best Paper II (Plaza E+F)
2:30pm - 2:45pm Break
2:45pm - 4:00pm Session 8A: Industrial Track (Plaza E) Session 8B: Non-Volatile Memory (Plaza F)
4:00pm Excursion

Wednesday, February 19th:

Time Session A Session B
8:30am - 9:40am Keynote II: Norm Rubin Keynotes (Florida B+C)
9:40am - 9:45am Room shift
9:45am - 11:00am Session 9A: Memory Management (Plaza E) Session 9B: Power (Plaza F)
11:00am - 11:15am Break
11:15am - 12:30pm Session 10A: Prefetching and Compression (Plaza E) Session 10B: Threading (Plaza F)
12:30pm - 12:45pm Closing Remarks (Plaza E)

Full Program:

Session 1A: Caches (Plaza E)

Session Chair: Gabriel Loh, AMD Research

1. Locality-Aware Data Replication in the Last-Level Cache
George Kurian (Massachusetts Institute of Technology), Srinivas Devadas (Massachusetts Institute of Technology), Omer Khan (University of Connecticut)

2. Adaptive Placement and Migration Policy for an STT-RAM-Based Hybrid Cache
Zhe Wang (Texas A&M University), Daniel A. Jimenez (Texas A&M University), Cong Xu (Pennsylvania State University), Guangyu Sun (Peking University), Yuan Xie (Pennsylvania State University/AMD Research)

3. DASCA: Dead Write Prediction Assisted STT-RAM Cache Architecture
Junwhan Ahn (Seoul National University), Sungjoo Yoo (Pohang University of Science and Technology), Kiyoung Choi (Seoul National University)

4. A Detailed GPU Cache Model Based on Reuse Distance Theory
Cedric Nugteren (Eindhoven University of Technology), Gert-Jan van den Braak (Eindhoven University of Technology), Henk Corporaal (Eindhoven University of Technology), Henri Bal (Vrije Universiteit Amsterdam)

Session 1B: Reliability and Process Variation (Plaza F)

Session Chair: Ramon Canal, UPC

1. Precision-Aware Soft Error Protection for GPUs
David J. Palframan (University of Wisconsin-Madison), Nam Sung Kim (University of Wisconsin-Madison), Mikko H. Lipasti (University of Wisconsin-Madison)

2. Understanding the Impact of Gate-Level Physical Reliability Effects on Whole Program Execution
Raghuraman Balasubramanian (University of Wisconsin-Madison), Karthikeyan Sankaralingam (University of Wisconsin-Madison)

3. Accordion: Toward Soft Near-Threshold Voltage Computing
Ulya R. Karpuzcu (University of Minnesota), Ismail Akturk (University of Minnesota), Nam Sung Kim (University of Wisconsin-Madison)

4. Mosaic: Exploiting the Spatial Locality of Process Variation to Reduce Refresh Energy in On-Chip eDRAM Modules
Aditya Agrawal (University of Illinois at Urbana-Champaign), Amin Ansari (University of Illinois at Urbana-Champaign), Josep Torrellas (University of Illinois at Urbana-Champaign)

Session 2A: Race Detection and Instruction Monitoring (Plaza E)

Session Chair: Paul Gratz, Texas A&M University

1. Low-Overhead and High Coverage Run-Time Race Detection Through Selective Meta-data Management
Ruirui Huang (Intel), Erik Halberg (Cornell University), Andrew Ferraiuolo (Cornell University), G. Edward Suh (Cornell University)

2. FADE: A Programmable Filtering Accelerator for Instruction-Grain Monitoring
Sotiria Fytraki (EPFL), Evangelos Vlachos (Oracle Labs), Onur Kocberber (EPFL), Babak Falsafi (EPFL), Boris Grot (University of Edinburgh)

3. Dynamically Detecting and Tolerating IF-Condition Data Races
Shanxiang Qi (Google), Abdullah A. Muzahid (University of Texas at San Antonio), Wonsun Ahn (University of Illinois at Urbana-Champaign), Josep Torrellas (University of Illinois at Urbana-Champaign)

Session 2B: Datacenters (Plaza F)

Session Chair: John Kim, KAIST

1. Exploiting Thermal Energy Storage to Reduce Data Center Capital and Operating Expenses
Wenli Zheng (The Ohio State University), Kai Ma (The Ohio State University), Xiaorui Wang (The Ohio State University)

2. Implications of High Energy Proportional Servers on Cluster-wide Energy Proportionality
Daniel Wong (University of Southern California), Murali Annavaram (University of Southern California)

3. Strategies for Anticipating Risk in Heterogeneous System Design
Marisabel Guevara (Duke University), Benjamin Lubin (Boston University), Benjamin C. Lee (Duke University)

Session 3A: Coherence and Consistency (Plaza E)

Session Chair: Lingjia Tang, University of Michigan

1. TSO-CC: Consistency directed cache coherence for TSO
Marco Elver (University of Edinburgh), Vijay Nagarajan (University of Edinburgh)

2. Stash Directory: A Scalable Directory for Many-Core Coherence
Socrates Demetriades (University of Pittsburgh), Sangyeun Cho (University of Pittsburgh)

3. QuickRelease: A Throughput Oriented Approach to Release Consistency on GPUs
Blake Hechtman (Duke University/AMD Research), Brad Beckmann (AMD Research), Derek Hower (Qualcomm Research), Mark Hill (University of Wisconsin-Madison), David Wood (University of Wisconsin-Madison), Steve Reinhardt (AMD Research), Shuai Che (AMD Research), Yingying Tian (Texas A&M University/AMD Research)

Session 3B: Best of CAL (Plaza F)

Session Chair: José Martínez, Cornell University

1. The Netflix Challenge: Datacenter Edition, January-June 2013 (vol. 12 no. 1), pp. 29-32
Christina Delimitrou (Stanford), Christos Kozyrakis (Stanford)

2. High Performance, Energy Efficient Chipkill Correct Memory with Multidimensional Parity, July-Dec. 2013 (vol. 12 no. 2), pp. 39-42
Xun Jian (University of Illinois at Urbana-Champaign), John Sartori (University of Illinois at Urbana-Champaign), Henry Duwe (University of Illinois at Urbana-Champaign), Rakesh Kumar (University of Illinois at Urbana-Champaign)

3. Shrink-Fit: A Framework for Flexible Accelerator Sizing, January-June 2013 (vol. 12 no. 1), pp. 17-20
Michael Lyons (Harvard University), Gu-Yeon Wei (Harvard University), David Brooks (Harvard University)

4. Clumsy Flow Control for High-Throughput Bufferless On-Chip Networks, July-Dec. 2013 (vol. 12 no. 2), pp. 47-50
Hanjoon Kim (KAIST), Yonggon Kim (KAIST), John Kim (KAIST)

Session 4A: Security and Cloning (Plaza E)

Session Chair: Daniel Sorin, Duke University

1. A Non-Inclusive Memory Permissions Architecture for Protection Against Cross-Layer Attacks
Jesse Elwell (SUNY Binghamton), Ryan Riley (Qatar University), Nael Abu-Ghazaleh (SUNY Binghamton), Dmitry Ponomarev (SUNY Binghamton)

2. Suppressing the Oblivious RAM Timing Channel While Making Information Leakage and Program Efficiency Trade-offs
Christopher W. Fletcher (Massachusetts Institute of Technology), Ling Ren (Massachusetts Institute of Technology), Xiangyao Yu (Massachusetts Institute of Technology), Marten Van Dijk (University of Connecticut), Omer Khan (University of Connecticut), Srinivas Devadas (Massachusetts Institute of Technology)

3. Timing Channel Protection for Memory Controllers
Yao Wang (Cornell University), Andrew Ferraiuolo (Cornell University), G. Edward Suh (Cornell University)

4. STM: Cloning the Spatial and Temporal Memory Access Behavior
Amro Awad (North Carolina State University), Yan Solihin (North Carolina State University)

Session 4B: GPUs (Plaza F)

Session Chair: Vijay Janapa Reddi, University of Texas at Austin

1. A Scalable Multi-Path Microarchitecture for Efficient GPU Control Flow
Ahmed ElTantawy (University of British Columbia), Jessica Wenjie Ma (University of British Columbia), Mike O'Connor (NVIDIA/University of Texas at Austin), Tor Aamodt (University of British Columbia)

2. Improving GPGPU Resource Utilization and Performance Through Alternative Thread Block Scheduling
Minseok Lee (KAIST), Sukwoo Song (KAIST), Joosik Moon (KAIST), John Kim (KAIST), Woong Seo (Samsung), Yeongon Cho (Samsung), Soojung Ryu (Samsung)

3. MRPB: Memory Request Prioritization for Massively Parallel Processors
Wenhao Jia (Princeton University), Kelly A. Shaw (University of Richmond), Margaret Martonosi (Princeton University)

4. Warp-Level Divergence in GPUs: Characterization, Impact and Mitigation
Ping Xiang (North Carolina State University), Yi Yang (NEC Labs), Huiyang Zhou (North Carolina State University)

Session 5A: Interconnection Networks (Plaza E)

Session Chair: George Michelogiannakis, Lawrence Berkeley National Laboratory

1. MP3: Minimizing Performance Penalty for Power-gating of Clos Network-on-Chip
Lizhong Chen (University of Southern California), Lihang Zhao (University of Southern California), Ruisheng Wang (University of Southern California), Timothy Mark Pinkston (University of Southern California)

2. Up By Their Bootstraps: Online Learning in Artificial Neural Networks for CMP Uncore Power Management
Jae-Yeon Won (Texas A&M University), Xi Chen (Texas A&M University), Paul V. Gratz (Texas A&M University), Jiang Hu (Texas A&M University), Vassos Soteriou (Cyprus University of Technology)

3. QORE: A Fault Tolerant Network-on-Chip Architecture with Power-Efficient Quad-Function Channel (QFC) Buffers
Dominic DiTomaso (Ohio University), Avinash Kodi (Ohio University), Ahmed Louri (The University of Arizona)

4. Transportation-Network Inspired Network-on-Chip
Hanjoon Kim (KAIST), Gwangsun Kim (KAIST), Hwasoo Yeo (KAIST), Seungryoul Maeng (KAIST), John Kim (KAIST)

Session 5B: DRAM (Plaza F)

Session Chair: Aamer Jaleel, Intel

1. Improving System Throughput and Fairness Simultaneously in CMP Systems via Dynamic Bank Partitioning
Mingli Xie (Peking University), Dong Tong (Peking University), Kan Huang (Peking University), Xu Cheng (Peking University)

2. Improving DRAM Performance by Parallelizing Refreshes with Accesses
Kevin Kai-Wei Chang (Carnegie Mellon University), Donghyuk Lee (Carnegie Mellon University), Zeshan Chishti (Intel), Chris Wilkerson (Intel), Alaa Alameldeen (Intel), Yoongu Kim (Carnegie Mellon University), Onur Mutlu (Carnegie Mellon University)

3. CREAM: A Concurrent-Refresh-Aware DRAM Memory System
Tao Zhang (Pennsylvania State University), Matt Poremba (Pennsylvania State University), Cong Xu (Pennsylvania State University), Guangyu Sun (Peking University), Yuan Xie (Pennsylvania State University/AMD Research)

4. DraMon: Predicting Memory Bandwidth Usage of Multi-threaded Programs with High Accuracy and Low Overhead
Wei Wang (University of Virginia), Tanima Dey (University of Virginia), Jack Davidson (University of Virginia), Mary Lou Soffa (University of Virginia)

Session 6: Best Paper I (Plaza E+F)

Session Chair: Murali Annavaram, University of Southern California

1. PVCoherence: Designing Flat Coherence Protocols for Scalable Verification
Meng Zhang (Duke University), Jesse D. Bingham (Intel), John Erickson (Intel), Daniel J. Sorin (Duke University)

2. Atomic SC for Simple In-order Processors
Dibakar Gope (University of Wisconsin-Madison), Mikko H. Lipasti (University of Wisconsin-Madison)

3. Concurrent and Consistent Virtual Machine Introspection with Hardware Transactional Memory
Yutao Liu (Shanghai Jiao Tong University), Yubin Xia (Shanghai Jiao Tong University), Haibing Guan (Shanghai Jiao Tong University), Binyu Zang (Shanghai Jiao Tong University), Haibo Chen (Shanghai Jiao Tong University)

4. Practical Data Value Speculation for Future High-end Processors
Arthur Perais (INRIA), André Seznec (INRIA)

Session 7: Best Paper II (Plaza E+F)

Session Chair: Ahmed Louri, University of Arizona

1. Tangle: Route-Oriented Dynamic Voltage Minimization for Variation-Afflicted, Energy-Efficient On-Chip Networks
Amin Ansari (University of Illinois at Urbana-Champaign), Asit Mishra (Intel), Jianping Xu (Intel), Josep Torrellas (University of Illinois at Urbana-Champaign)

2. Improving Cache Performance by Exploiting Read-Write Disparity
Samira Khan (Intel/Carnegie Mellon University), Alaa R. Alameldeen (Intel), Chris Wilkerson (Intel), Onur Mutlu (Carnegie Mellon University), Daniel A. Jimenez (Texas A&M University)

3. NUAT: A Non-Uniform Access Time Memory Controller
Wongyu Shin (KAIST), Jeongmin Yang (KAIST), Jungwhan Choi (KAIST), Lee-Sup Kim (KAIST)

Session 8A: Industrial Track (Plaza E)

Session Chair: Yasuko Eckert, AMD Research

1. Improving In-Memory Database Index Performance with Intel® Transactional Synchronization Extensions
Tomas Karnagel (TU Dresden), Roman Dementiev (Intel), Ravi Rajwar (Intel), Konrad Lai (Intel), Thomas Legler (SAP AG), Benjamin Schlegel (TU Dresden), Wolfgang Lehner (TU Dresden)

2. BigDataBench: a Big Data Benchmark Suite from Internet Services
Lei Wang (ICT, Chinese Academy of Sciences), Chunjie Luo (ICT, Chinese Academy of Sciences), Yongqiang He (Facebook), Jianfeng Zhan (ICT, Chinese Academy of Sciences), Kent Zhan (Tencent), Xiaona Li (Baidu), Yuqing Zhu (ICT, Chinese Academy of Sciences), Shujie Zhang (Huawei), Qiang Yang (ICT, Chinese Academy of Sciences), Bizhu Qiu (Yahoo!), Zhen Jia (ICT, Chinese Academy of Sciences)

3. 3D Stacking of High-Performance Processors
Philip Emma (IBM), Alper Buyuktosunoglu (IBM), Michael Healy (IBM), Krishnan Kailas (IBM), Valentin Puente (IBM), Roy Yu (IBM), Allan Hartstein (IBM), Pradip Bose (IBM), Jaime Moreno (IBM)

Session 8B: Non-Volatile Memory (Plaza F)

Session Chair: Jun Yang, University of Pittsburgh

1. Reducing the Cost of Persistence for Nonvolatile Heaps in End User Devices
Sudarsun Kannan (Georgia Institute of Technology), Ada Gavrilovska (Georgia Institute of Technology), Karsten Schwan (Georgia Institute of Technology)

2. Sprinkler: Maximizing Resource Utilization in Many-Chip Solid State Disks
Myoungsoo Jung (The University of Texas at Dallas), Mahmut Kandemir (Pennsylvania State University)

3. Over-Clocked SSD: Safely Running Beyond Flash Memory Chip I/O Clock Specs
Kai Zhao (Rensselaer Polytechnic Institute), Kalyana Venkataraman (Cavium), Xuebin Zhang (Rensselaer Polytechnic Institute), Jiangpeng Li (Shanghai Jiao Tong University), Ning Zheng (Rensselaer Polytechnic Institute), Tong Zhang (Rensselaer Polytechnic Institute)

Session 9A: Memory Management (Plaza E)

Session Chair: Rajeev Balasubramonian, University of Utah

1. GPUdmm: A High-Performance and Memory-Oblivious GPU Architecture Using Dynamic Memory Management
Youngsok Kim (POSTECH), Jaewon Lee (POSTECH), Jae-Eon Jo (POSTECH), Jangwoo Kim (POSTECH)

2. Increasing TLB Reach by Exploiting Clustering in Page Translations
Binh Pham (Rutgers University), Abhishek Bhattacharjee (Rutgers University), Yasuko Eckert (AMD Research), Gabriel H. Loh (AMD Research)

3. Supporting x86-64 Address Translation for 100s of GPU Lanes
Jason Power (University of Wisconsin-Madison), Mark D. Hill (University of Wisconsin-Madison), David A. Wood (University of Wisconsin-Madison)

Session 9B: Power (Plaza F)

Session Chair: Ulya Karpuzcu, University of Minnesota

1. Scalably Verifiable Dynamic Power Management
Opeoluwa Matthews (Duke University), Meng Zhang (Duke University), Daniel J. Sorin (Duke University)

2. Revolver: Processor Architecture for Power Efficient Loop Execution
Mitchell Hayenga (ARM), Mikko H. Lipasti (University of Wisconsin-Madison), Vignyan Reddy (University of Wisconsin-Madison)

3. Dynamic Management of TurboMode in Modern Multi-core Chips
David Lo (Stanford), Christos Kozyrakis (Stanford)

Session 10A: Prefetching and Compression (Plaza E)

Session Chair: Xiangyu Dong, Qualcomm

1. Spare Register Aware Prefetching for Graph Algorithms on GPUs
Nagesh B. Lakshminarayana (Georgia Institute of Technology), Hyesoon Kim (Georgia Institute of Technology)

2. Sandbox Prefetching: Safe, Run-Time Evaluation of Aggressive Prefetchers
Seth Pugsley (University of Utah), Zeshan Chishti (Intel), Chris Wilkerson (Intel), Troy Chuang (Intel), Robert Scott (Intel), Aamer Jaleel (Intel), Shih-Lien Lu (Intel), Kingsum Chow (Intel), Rajeev Balasubramonian (University of Utah)

3. Memzip: Exploiting Unconventional Benefits from Memory Compression
Ali Shafiee (University of Utah), Meysam Taassori (University of Utah), Rajeev Balasubramonian (University of Utah), Al Davis (University of Utah)

Session 10B: Threading (Plaza F)

Session Chair: Mark Hempstead, Drexel University

1. CDTT: Compiler-generated data-triggered threads
Hung-Wei Tseng (University of California, San Diego), Dean Tullsen (University of California, San Diego)

2. Accelerating Decoupled Look-ahead via Weak Dependence Removal: A Metaheuristic Approach
Raj Parihar (University of Rochester), Michael C. Huang (University of Rochester)

3. Undersubscribed Threading for High-Performance and Energy-Efficient Many-Core Execution
Wim Heirman (Ghent University), Trevor Carlson (Ghent University), Kenzo Van Craeynest (Ghent University), Ibrahim Hur (Intel), Aamer Jaleel (Intel), Lieven Eeckhout (Ghent University)