BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Two generations of Many-Core Computational Arrays - Bevan Baas\, U
 C Davis
DTSTART:20080925T150000Z
DTEND:20080925T160000Z
UID:TALK13312@talks.cam.ac.uk
CONTACT:Robert Mullins
DESCRIPTION:The Asynchronous Array of Simple Processors (AsAP) is a progra
 mmable\n and reconfigurable processing system that: enables high throughpu
 t\n and high energy-efficiency\, is well matched to workloads containing\n
  many varied DSP tasks\, and is well suited for deep submicron VLSI\n fabr
 ication technologies.\nThe AsAP platform is composed of a large number of 
 programmable\n "reduced complexity" processing elements designed to captur
 e the\n targeted task kernels but with very little additional overhead.\n 
 Processors contain individual digitally-tunable clock oscillators\n operat
 ing completely independently with respect to each other\n (GALS)\, and pro
 cessors communicate through a reconfigurable full-rate\n 2-D mesh network.
   Individual clock oscillators fully halt in 9\n cycles when there is no w
 ork to do\, and restart at full speed in\n less than one cycle after work 
 becomes available.\nA chip containing 36 programmable processors was fabri
 cated in\n 0.18 um CMOS using standard cells and is fully functional.  Eac
 h\n 0.66 mm^2 processor operates up to 610 MHz at 2.0 V and dissipates\n 3
 2 mW average at 475 MHz and 1.8 V\, and 2.4 mW at 116 MHz and 0.9 V\n whil
 e executing applications. [ISSCC06]\nSeveral dozen DSP and general tasks h
 ave been coded including\n 32-1024 point complex FFTs\, a k=7 Viterbi deco
 der\, a JPEG encoder\,\n a full-rate HDTV H.264 CAVLC encoder\, and a full
 y compliant\n IEEE 802.11a/11g wireless LAN baseband transmitter and recei
 ver.\n Power\, throughput\, and area results compare very well with existi
 ng\n programmable DSP processors. A recently completed C compiler and\n au
 tomatic mapping tool greatly simplify programming.\nA second generation 65
  nm CMOS design contains 167 processors and has\n many new architectural f
 eatures including dedicated FFT\, Viterbi\,\n and video motion estimation 
 processors\; 16 KB shared memories\;\n and long-distance inter-processor i
 nterconnect.  The programmable\n processors are able to individually and d
 ynamically change their\n supply voltage (choosing among VddHi\, VddLo\, o
 r disconnected)\n and clock frequency.  The chip is fully functional with 
 early\n measurements showing the programmable processors operating up to\n
  1.2 GHz while dissipating 59 mW at 1.3 V.  At a supply voltage\n of 0.675
  V\, they operate at 66 MHz while dissipating only 608 uW.
LOCATION:SS03\, Computer Laboratory\, William Gates Building
END:VEVENT
END:VCALENDAR
