AutoVision: a reconfigurable computing system for video-based driver assistance systems in automobiles

Christopher Claus, Walter Stechele, Andreas Herkersdorf
June 2nd, 2006

Technische Universität München
Theresienstr. 90, Building N1, 2nd floor, 80290 München
www.lis.ei.tum.de
© Lehrstuhl für Integrierte Systeme

Overview
- Motivation of AutoVision
- AutoVision & Coprocessors
  - Video Interface ML310
  - Basic Pixel Processing Engine
  - Edge Engine
  - Contrast Engine
  - Luminanz Engine (luminance)
  - Tunnel Entrance Detection
- Reconfiguration times
  - Example Systems
  - Early Access Partial Reconfiguration (EAPR)
  - Combitgen
  - Current Problems & possible solutions
- Dynamic Partial Reconfiguration
- Cooperations & Publications
- Outlook

AutoVision Processor
[Block diagram labels: Xilinx Virtex-II Pro; PPC; Mem I/F; Video I/F; Bus; engine slots with local memory (Classifier Engine, Luminanz Engine, Shape Engine, xxxx Engine); reconfig; region to enhance contrast]
[Table: engines used per driving situation (Highway, Tunnel entrance, Inside tunnel) for the Shape Engine, Contrast Engine, Edge Engine, Luminanz Engine and PPC. The positions of the X marks are not recoverable from the transcript.]

Video interface
System Overview - Hardware Setup
[Diagram labels: webcam, framegrabber, webcam driver, Video 4 Linux driver, SW]
6 Mbit/s / (640 x 480 pixels/frame x 8 bit/pixel) = 2.44 frames/s

Basic Pixel Processing Engine
[Block diagram labels, in transcript order: PLB; Config. Registers; Input Transport; IPIF; Cache; Memory Arbiter; Matrix FIFO; User Process; ContrastEng; EdgeEng]

Edge Engine
- Edge detection
- Lane detection (Hough transform)
- Detection and labeling of horizontal lines -> obstacles

Edge Engine Performance
Input: QVGA grayscale, 320 x 240 pixels, 8 bit/pixel
Output: edge-filtered image
- EdgeEngine: 2.389 ms
- PPC: 26.590 ms

Contrast Engine
- Region to enhance contrast
- Sample window to get Ymax and Ymin
[Figure: two luminance scales from 0 to 255 with Ymin and Ymax marked]

Luminanz Engine
- Detection of pairs of lights, thresholding and region finding (in HW)
- Grouping of lights, detection of license plates and classification are done in SW (flexibility)

Tunnel Entrance Detection
- The algorithm searches for tunnel entrances (black holes)
- Lane detection verifies that the found entrance is correct
- The found tunnel is marked as region of interest
- Currently implemented in Matlab -> future HW version with System Generator

Reducing Reconfiguration Times
Motivation: 40 ms are available to process one image. If image processing can be done in 35 ms -> replace engines in 5 ms -> no frames dropped.
Possibilities to shorten reconfiguration times:
1. The number of frames to be written is proportional to the reconfiguration time (Combitgen)
2. How frames are written to configuration memory -> reduce overhead (Combitgen)
3. Memory access speed -> Steffen Toscher, Magdeburg

Example: Adder/Subtractor
[Diagram labels: Adder/Subtractor, frames to be reconfigured, ICAP]
- A small program running on the PPC writes input values over the OPB to registers.
- Bitstreams are stored on a CompactFlash card -> loaded into DDR during initialization.
- The Internal Configuration Access Port (ICAP) handles the reconfiguration from 'inside'.
  Bitstreams are loaded from DDR -> ICAP -> reconfiguration.
- The adder can be replaced by the subtractor at runtime, and vice versa, while the static part remains operational.
- Disadvantage: routes that cross the reconfigurable area -> bus macros have to be used -> timing problems

What is EAPR
- The requirement for whole-column PR regions is removed. The Early Access Partial Reconfiguration (EAPR) flow now allows PR regions of any rectangular size.
- The EAPR flow allows signals (routes) in the base design to cross a partially reconfigurable region without using a bus macro.
[Figure labels: static, rec., static, BM, BM]

Example EAPR Design
- No more bus macros for connections between static areas
- Complete peripherals are exchanged
- Timing advantage
- Design is handcrafted
- DCMs must be instantiated manually at the top level
- Automate and optimize the EAPR flow
- Use Combitgen as an add-on for the EAPR flow

Configuration Details (1)
[Figure labels: Frame #1, Frame #2]
Regular configuration over FDRI:
- The write operation is pipelined through the frame buffer
- Every write process ends with one frame of pad data (206 x 32 bit)
Source: Virtex-II Pro and Virtex-II Pro X FPGA User Guide

Configuration Details (2)
Configuration over MFWR:
- Write one frame to multiple addresses
- Every write process needs 13 x 32 bit words
- Single frames -> regular method; identical frames -> MFWR
- Single frames -> MFWR; identical frames -> MFWR
[Figure labels: MFWR register; identical frames in BRAM and CLB columns]

Combitgen
[Figure labels: Toplevel Bitstreams, Partial Bitstreams, Pseudo-FPGA memory, Differential Frame-Bitmap]
- CRC calculation
- Single MFWR

Adder/Subtractor Results

Rec. flow         From slice #   To slice #   Size in bytes   Frames to write   Rec. time (ms)
Partial Mask      70             80           65325           218               61.9
Part. M. compr.   70             80           65325           218               61.9
Diff. Based       70             80           27578           22                26.1
Combitgen         70             80           20306           22                19.2

Combitgen vs. Partial Mask: factor 3.2 in size, factor 10 in frames to write, factor 3.2 in reconfiguration time.
Combitgen vs. Diff. Based: factor 1.4 in size, factor 1.4 in reconfiguration time.

              EAPR bitstream (bytes)   Combitgen bitstream (bytes)
Adder         222,301                  95,774
Subtractor    218,285                  93,466
Factor 2.3

Current problems
- The ML310 Linux/PCI base design is not suited for the EAPR flow
- PCI occupies almost one third of the XC2VP30
- 3 fps instead of 25 fps due to USB 1.1 support
- Not possible to run two engines in parallel due to limited physical resources
- Alternative: ESM

Possible Solutions (XUP Board)
[Figure labels: XSGA video output, video decoder board]
- Same FPGA
- Every V2P EAPR design was implemented on this board
- 25 fps possible through the video decoder board
- On-board XSGA video output, no need for PCI, more physical resources

Cooperations
Prof. Dr.-Ing. Jürgen Becker, Dipl.-Ing. Michael Hübner, Karlsruhe:
- Knowledge transfer, student exchange, joint activities
- "On-Line und On-Chip Visualisierung von dynamischer und partieller Rekonfiguration" (on-line and on-chip visualization of dynamic and partial reconfiguration)
- Generating compressed bitstreams with Combitgen and using the decompressor unit from KA
Prof. Dr. Roland Kasper, Dipl.-Ing. Steffen Toscher, Magdeburg:
- "Methoden für die Hardware-ICAP Ansteuerung" (methods for hardware ICAP control): direct connection between the Magdeburg ICAP and main memory to speed up reconfiguration
Prof. Dr.-Ing. Jürgen Teich, Dipl.-Ing. Mateusz Majer, Erlangen:
- "Integration von Videofilterengines auf der ESM-Plattform" (integration of video filter engines on the ESM platform; Diplomarbeit)
Prof. Dr. Udo Kebschull, Dipl.-Ing. Norbert Abel, Heidelberg:
- Knowledge transfer on dynamic partial reconfiguration

Publications
- C. Claus, F. H. Müller, W. Stechele: "Combitgen: A new approach for creating partial bitstreams in Virtex-II Pro devices", ARCS 2006, Dynamically Reconfigurable Systems (DRS) Workshop, Frankfurt
- W. Stechele: "Video Processing using Reconfigurable Hardware Acceleration for Driver Assistance", DATE 2006, Future Trends in Automotive Electronics and Tool Integration Workshop, Munich
- R. Denchev, W. Stechele: "An Experimentation Environment for MPEG-7 based Driver Assistance", Eurocon 2005, Belgrade
- W. Stechele: "Dynamically Reconfigurable Systems-on-Chip for Video-based Driver Assistance", Dagstuhl Seminar Proceedings 06141 on Dynamically Reconfigurable Architectures, April 2-7, 2006
- C. Claus, H. C. Shin, W. Stechele: "Tunnel Entrance Recognition for video-based Driver Assistance Systems", submitted to IWSSIP 2006, 13th International Conference on Systems, Signals and Image Processing

Outlook
- Achieve faster reconfiguration times
- Reconfigure complete HW accelerators (engines)
- Improve Combitgen (compression of bitstreams)
- Improve configuration visualisation
- Transfer the current system to the XUP board
- More engines (ShapeEngine, lane detection…)
- Demonstrator
- Use synergies between SPP1148 partners
- New cooperations
- Defragmentation approach

Defragmentation approach
- Different engines occupy different amounts of physical resources
- PPCs constitute non-homogeneous areas inside the reconfigurable area
[Figure labels: reconfigurable area with Engine1 to Engine5, bus macros (BM), PLB, non-homogeneous area]

Thank you for your attention!
Questions, discussion?

Special thanks to: Johannes Zeppenfeld, Jian Wang, Hoo Chang Shin, Florian Müller, Carlos Bernal, Ingmar Cramm, Nicolas Alt
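
Worked example (not part of the original slides): the timing figures quoted in the video interface, Edge Engine performance and reconfiguration slides can be checked with a few lines of Python. All input numbers are taken from the slides; the function names and the frame-drop model (any reconfiguration time beyond the 5 ms budget costs whole 40 ms frame periods) are assumptions made for illustration only.

```python
import math

def frames_per_second(link_bits_per_s, width, height, bits_per_pixel):
    """Frames per second that fit through a link of the given bit rate."""
    return link_bits_per_s / (width * height * bits_per_pixel)

def reconfig_budget_ms(frame_period_ms, processing_ms):
    """Time left in one frame period after image processing."""
    return frame_period_ms - processing_ms

def frames_dropped(rec_time_ms, budget_ms, frame_period_ms):
    """Assumed model: time beyond the budget costs whole frame periods."""
    overrun_ms = max(0.0, rec_time_ms - budget_ms)
    return math.ceil(overrun_ms / frame_period_ms)

# Video interface slide: 6 Mbit/s, 640 x 480 pixels, 8 bit/pixel.
webcam_fps = frames_per_second(6_000_000, 640, 480, 8)

# Edge Engine performance slide: PPC time divided by EdgeEngine time.
edge_speedup = 26.590 / 2.389

# Reconfiguration slide: 40 ms per image, 35 ms of processing.
budget_ms = reconfig_budget_ms(40.0, 35.0)

# Adder/Subtractor results: measured reconfiguration times in ms.
rec_times_ms = {"Partial Mask": 61.9, "Diff. Based": 26.1, "Combitgen": 19.2}
dropped = {flow: frames_dropped(t, budget_ms, 40.0)
           for flow, t in rec_times_ms.items()}

if __name__ == "__main__":
    print(f"webcam frame rate: {webcam_fps:.2f} frames/s")
    print(f"EdgeEngine speedup over PPC: {edge_speedup:.1f}x")
    print(f"reconfiguration budget: {budget_ms:.0f} ms")
    for flow, n in dropped.items():
        print(f"{flow}: {rec_times_ms[flow]} ms -> {n} frame(s) dropped")
```

Under this simple model even the Combitgen bitstream of the adder/subtractor example exceeds the 5 ms budget, which is consistent with the outlook item on achieving faster reconfiguration times.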
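
Illustrative sketch (not part of the original slides): the Contrast Engine slide mentions a sample window that yields Ymin and Ymax and a region whose contrast is enhanced, with scales from 0 to 255. One common way to use these values is a linear contrast stretch, shown below in plain Python. That the engine uses exactly this mapping is an assumption; the function names and the (top, left, height, width) window format are hypothetical.

```python
def sample_range(image, window):
    """Return (ymin, ymax) of the luminance values inside the window.

    window is (top, left, height, width); this layout is an assumption.
    """
    top, left, height, width = window
    values = [image[r][c]
              for r in range(top, top + height)
              for c in range(left, left + width)]
    return min(values), max(values)

def stretch_region(image, region, ymin, ymax):
    """Linearly map [ymin, ymax] to [0, 255] inside region.

    Pixels outside the region are copied unchanged. Values outside
    [ymin, ymax] are clamped before scaling.
    """
    top, left, height, width = region
    out = [row[:] for row in image]
    if ymax == ymin:
        return out  # flat sample window: nothing to stretch
    for r in range(top, top + height):
        for c in range(left, left + width):
            y = min(max(image[r][c], ymin), ymax)
            out[r][c] = (y - ymin) * 255 // (ymax - ymin)
    return out

# Tiny 3 x 3 grayscale image; the top two rows form the low-contrast region.
image = [
    [100, 110, 120],
    [105, 115, 125],
    [10, 200, 130],
]
region = (0, 0, 2, 3)
ymin, ymax = sample_range(image, region)
enhanced = stretch_region(image, region, ymin, ymax)
```

Here the sample window and the region to enhance are the same rectangle; the slide lists them separately, so they may differ in the real engine.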