Autovision - a reconfigurable
computing system for video-based
driver assistance systems in the automobile
Christopher Claus
Walter Stechele
Andreas Herkersdorf
June 2nd, 2006
Technische Universität
80290 München
Theresienstr. 90
Building N1, 2nd floor
www.lis.ei.tum.de
Overview
-Motivation of Autovision
-Autovision & Coprocessors
  -Video Interface ML310
  -Basic Pixel Processing Engine
  -Edge Engine
  -Contrast Engine
  -Luminanz Engine
  -Tunnel Entrance Detection
-Dynamic Partial Reconfiguration
  -Reconfiguration times
  -Example Systems
  -Early Access Partial Reconfiguration (EAPR)
  -Combitgen
  -Current Problems & possible solutions
-Cooperations & Publications
-Outlook
© Lehrstuhl für
Integrierte Systeme
AutoVision Processor
[Figure: block diagram of the AutoVision processor on a Xilinx Virtex-II Pro. Labels: PPC, PPC Mem I/F, Video I/F, Bus, reconfig, and engines with local memory (mem): Classifier Engine, Luminanz Engine, Shape Engine, Contrast Engine, Edge Engine, xxxx Engine. Driving situations shown: Highway, Tunnel entrance, Inside tunnel. An image inset marks the region to enhance contrast.]
Video interface
System Overview - Hardware Setup
6 Mbit/s / (640 * 480 pixel/frame * 8 bit/pixel) = 2.44 frames/s
[Figure: hardware setup. Labels: SW, Video 4 Linux, driver (webcam), driver (framegrabber).]
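The frame-rate estimate on this slide can be reproduced with a few lines of Python. This is a minimal sketch that assumes the "6MBit" figure is a usable link rate of 6,000,000 bit/s; the function name `frames_per_second` is hypothetical and not part of the AutoVision software.

```python
def frames_per_second(link_bits_per_s: float, width: int, height: int,
                      bits_per_pixel: int) -> float:
    """Upper bound on the frame rate for uncompressed video over a link."""
    bits_per_frame = width * height * bits_per_pixel
    return link_bits_per_s / bits_per_frame


# Assumption: 6 Mbit/s usable bandwidth, VGA grayscale frames (8 bit/pixel).
fps = frames_per_second(6_000_000, 640, 480, 8)
print(f"{fps:.2f} frames/s")
```

With the same function, a QVGA frame (320 x 240) at 8 bit/pixel would need a quarter of the bandwidth per frame.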
Basic Pixel Processing Engine
[Figure: block diagram of the basic pixel processing engine. Labels: PLB, IPIF, Config. Registers, Input, Transport, Cache, Memory, Arbiter, Matrix, FIFO, User Process (ContrastEng, EdgeEng).]
Edge Engine
-Edge Detection
-Lane Detection (Hough transformation)
-Detection and labeling of horizontal lines -> obstacles
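The edge detection step listed above could look roughly like the following software sketch. The slides do not name the filter used in the Edge Engine, so the Sobel operator here is an assumption chosen for illustration, and `sobel_edges` is a hypothetical name.

```python
def sobel_edges(img, threshold):
    """Return a binary edge map (rows of 0/1) for a grayscale image.

    img is a list of rows of pixel values. Border pixels are left at 0.
    Assumption: Sobel gradients with an |gx| + |gy| magnitude estimate.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if abs(gx) + abs(gy) >= threshold:
                out[y][x] = 1
    return out


# Tiny test image: dark left half, bright right half (a vertical edge).
image = [[0, 0, 0, 200, 200, 200] for _ in range(5)]
edges = sobel_edges(image, threshold=400)
```

The 3 x 3 neighbourhood access pattern is the kind of operation that a pixel matrix with line buffers can serve in hardware, one output pixel per input pixel.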
Edge Engine Performance
QVGA grayscale input (320 x 240 pixels, 8 bit/pixel) -> edge-filtered output
EdgeEngine: 2.389 ms
PPC: 26.590 ms
Contrast Engine
-Region to enhance contrast
-Sample window to get Ymax and Ymin
[Figure: luminance transfer curve with axes from 0 to 255; Ymin and Ymax are marked on the input axis.]
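One plausible reading of the Ymin/Ymax diagram is a linear contrast stretch: the luminance range found in the sample window is mapped onto the full 0..255 range. The sketch below assumes exactly that mapping; the slides do not spell out the formula, and `stretch_contrast` is a hypothetical name.

```python
def stretch_contrast(region, sample_window):
    """Linearly stretch luminance so that Ymin -> 0 and Ymax -> 255.

    Ymin and Ymax are taken from sample_window; values of region outside
    that range are clamped. Integer arithmetic keeps the result exact.
    """
    y_min = min(min(row) for row in sample_window)
    y_max = max(max(row) for row in sample_window)
    if y_max == y_min:
        # Flat sample window: nothing to stretch, return a copy.
        return [row[:] for row in region]
    span = y_max - y_min
    return [[max(0, min(255, (y - y_min) * 255 // span)) for y in row]
            for row in region]


stretched = stretch_contrast([[20, 50, 100, 150, 200]], [[50, 150]])
print(stretched)
```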
Luminanz Engine
- Detection of pairs of lights, thresholding and finding regions (in HW)
- Grouping of lights, detection of license plates and classification done in SW (flexibility)
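The hardware part of the Luminanz Engine (thresholding and finding regions) can be illustrated in software as follows. This is a sketch under the assumption of a fixed threshold and 4-connected regions; the actual engine may work differently, and `bright_regions` is a hypothetical name.

```python
from collections import deque


def bright_regions(img, threshold):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected bright regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if img[y][x] < threshold or seen[y][x]:
                continue
            # Breadth-first flood fill starting at a new bright pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and img[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((x0, y0, x1, y1))
    return regions


frame = [
    [0,   0,   0, 0,   0, 0],
    [0, 250, 250, 0, 240, 0],
    [0, 250,   0, 0, 240, 0],
    [0,   0,   0, 0,   0, 0],
]
lights = bright_regions(frame, threshold=200)
```

The resulting bounding boxes are the kind of compact data that software could then group into pairs of lights and classify.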
Tunnel Entrance Detection
-Algorithm searches for tunnel entrances (black holes)
-Lane detection verifies that the entrance found is correct
-The tunnel found is marked as region of interest
-Currently implemented in Matlab -> future HW version with System Generator
Reducing Reconfiguration Times
Motivation: 40 ms are available to process one image.
If image processing can be done in 35 ms
-> engines can be replaced in the remaining 5 ms -> no frames dropped
Possibilities to shorten reconfiguration times:
1. Number of frames to be written ~ reconfiguration time (Combitgen)
2. How frames are written to the configuration memory -> reduce overhead (Combitgen)
3. Memory access speed -> Steffen Toscher, Magdeburg
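The timing argument above is a simple budget check. The sketch below uses the 40 ms and 35 ms figures from the slide; the function names are hypothetical.

```python
FRAME_PERIOD_MS = 40.0  # time available per image, as stated on the slide


def reconfig_budget_ms(processing_ms, frame_period_ms=FRAME_PERIOD_MS):
    """Time left in one frame period after image processing."""
    return frame_period_ms - processing_ms


def fits_without_frame_drop(reconfig_ms, processing_ms,
                            frame_period_ms=FRAME_PERIOD_MS):
    """True if processing plus reconfiguration fit into one frame period."""
    return processing_ms + reconfig_ms <= frame_period_ms


budget = reconfig_budget_ms(35.0)
print(budget)                                # 5.0
print(fits_without_frame_drop(5.0, 35.0))    # True
print(fits_without_frame_drop(19.2, 35.0))   # False
```

The last call uses the 19.2 ms reported later in this deck for the adder/subtractor example: under the 35 ms processing assumption, that reconfiguration would still cost a frame.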
Example: Adder/Subtractor
[Figure: FPGA floorplan. Labels: Adder/Subtractor, Frames to be recon., ICAP.]
-A small program running on the PPC writes input values over the OPB to registers.
-Bitstreams are stored on a CompactFlash card and loaded into DDR during initialization.
-The Internal Configuration Access Port (ICAP) handles the reconfiguration from 'inside': bitstreams are loaded from DDR -> ICAP -> reconfiguration.
-The adder can be replaced by the subtractor at runtime and vice versa while the static part remains operational.
-Disadvantage: routes that cross the reconfigurable area -> busmacros have to be used -> timing problems
What is EAPR?
The requirement for whole-column PR regions is removed.
The Early Access Partial Reconfiguration (EAPR) flow now
allows PR regions of any rectangular size.
The EAPR flow allows signals (routes) in the base design to
cross a partially reconfigurable region without
using a bus macro.
[Figure: floorplan. Labels: static, rec., static, BM, BM.]
Example EAPR Design
-No more busmacros for connections between static areas
-Complete peripherals are exchanged
-Timing advantage
-Design is handcrafted
-DCMs must be instantiated manually at the top level
-Automate and optimize the EAPR flow
-Use Combitgen as add-on for the EAPR flow
Configuration Details (1)
Regular configuration over FDRI:
-Write operation is pipelined through the frame buffer
-Every write process ends with one frame of pad data (206 x 32 bit)
[Figure: frame write sequence. Labels: Frame #1, Frame #2.]
Source: Virtex-II Pro and Virtex-II Pro X FPGA User Guide
Configuration Details (2)
Configuration over MFWR (Multiple Frame Write Register):
-Write one frame to multiple addresses
-Every write process needs 13 x 32 bit words
-Either: single frames -> regular method, identical frames -> MFWR
-Or: single frames -> MFWR, identical frames -> MFWR
[Figure: configuration memory columns. Labels: BRAM, CLBs, Identical frame (three times), MFWR Register.]
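To see why MFWR pays off for identical frames, the word counts from the two "Configuration Details" slides can be put into a rough cost model. This is a deliberately simplified sketch, not a device-accurate calculation: it assumes a frame length of 206 words, one pad frame per regular write, 13 words per MFWR write, a contiguous block for the regular write, and it ignores all other command and header words. The function names are hypothetical.

```python
FRAME_WORDS = 206           # 32-bit words per frame (from the FDRI slide)
MFWR_WORDS_PER_WRITE = 13   # 32-bit words per MFWR write (from the MFWR slide)


def regular_write_words(n_frames):
    """Words for one regular FDRI write of n contiguous frames plus pad frame."""
    return (n_frames + 1) * FRAME_WORDS


def mfwr_write_words(n_identical):
    """Words to place one frame at n_identical addresses.

    Assumed model: the frame is loaded once with a regular write,
    then each further address costs one MFWR write.
    """
    return regular_write_words(1) + (n_identical - 1) * MFWR_WORDS_PER_WRITE


for n in (1, 2, 10):
    print(n, regular_write_words(n), mfwr_write_words(n))
```

Under these assumptions, ten identical frames cost 2266 words with the regular method and 529 words with MFWR, so the saving grows with the number of identical frames.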
Combitgen
[Figure: Combitgen flow. Labels: Toplevel Bitstreams (1, 2), Partial Bitstreams, Pseudo-FPGA-Memory, Differential Frame-Bitmap.]
-CRC calculation
-Single MFWR
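The idea of a differential frame bitmap can be sketched as follows. This is an illustration only and not Combitgen's actual implementation: configurations are modelled as lists of frame contents (bytes), list indices stand in for frame addresses, and both function names are hypothetical.

```python
def differential_frame_bitmap(current, target):
    """Mark every frame whose content differs between two configurations."""
    if len(current) != len(target):
        raise ValueError("configurations must have the same number of frames")
    return [a != b for a, b in zip(current, target)]


def group_identical_frames(target, bitmap):
    """Group the changed frame addresses by identical frame content.

    Each group with more than one address is a candidate for writing the
    frame once and replicating it to the remaining addresses.
    """
    groups = {}
    for addr, changed in enumerate(bitmap):
        if changed:
            groups.setdefault(target[addr], []).append(addr)
    return groups


current = [b"A", b"B", b"C", b"D"]
target = [b"A", b"X", b"X", b"D"]
bitmap = differential_frame_bitmap(current, target)
groups = group_identical_frames(target, bitmap)
```

Only frames flagged in the bitmap need to be written, which is how a differential approach reduces the number of frames compared with rewriting the whole region.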
Adder/Subtractor Results
Rec. flow         From slice #   To slice #   Size in bytes   Frames to write   Rec. time (ms)
Partial Mask      70             80           65325           218               61.9
Part. M. compr.   70             80           65325           218               61.9
Diff. Based       70             80           27578           22                26.1
Combitgen         70             80           20306           22                19.2

Factor Partial Mask / Combitgen: size 3.2, frames to write 10, rec. time 3.2
Factor Diff. Based / Combitgen: size 1.4, rec. time 1.4

             EAPR bitstream (bytes)   Combitgen bitstream (bytes)   Factor
Adder        222,301                  95,774                        2.3
Subtractor   218,285                  93,466                        2.3
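The factors quoted in the results can be rechecked from the raw numbers, and the same numbers give an effective configuration throughput. The sketch below takes size, frame count and reconfiguration time for the Partial Mask, Diff. Based and Combitgen flows from the results above; the variable and function names are hypothetical.

```python
# (size in bytes, frames to write, reconfiguration time in ms)
results = {
    "Partial Mask": (65325, 218, 61.9),
    "Diff. Based": (27578, 22, 26.1),
    "Combitgen": (20306, 22, 19.2),
}


def factors(baseline, improved):
    """Element-wise ratio baseline / improved for size, frames and time."""
    return tuple(b / i for b, i in zip(results[baseline], results[improved]))


def throughput_bytes_per_ms(flow):
    """Effective throughput: bitstream size divided by reconfiguration time."""
    size, _, time_ms = results[flow]
    return size / time_ms


size_f, frames_f, time_f = factors("Partial Mask", "Combitgen")
print(round(size_f, 1), round(frames_f), round(time_f, 1))
for flow in results:
    print(flow, round(throughput_bytes_per_ms(flow)))
```

All three flows come out at roughly 1056 bytes/ms, which suggests that in this setup the reconfiguration time is governed by the amount of data written rather than by the flow itself.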
Current problems
-ML310 Linux-PCI base design is not suited for the EAPR flow
-PCI occupies almost one third of the XC2VP30
-3 fps instead of 25 fps due to USB 1.1 support
-Not possible to run two engines in parallel due to limited physical resources
-Alternative: ESM
Possible Solutions (XUP Board)
[Figure: XUP board. Labels: XSGA Video Output, Video decoder Board.]
-Same FPGA
-Every V2P EAPR design was implemented on this board
-25 fps possible through the video decoder board
-On-board XSGA video output, no need for PCI, more physical resources
Cooperations
Prof. Dr.-Ing. Jürgen Becker, Dipl.-Ing. Michael Hübner, Karlsruhe:
- Knowledge transfer, student exchange, joint activities
- „On-Line und On-Chip Visualisierung von dynamischer und partieller Rekonfiguration“ (online and on-chip visualization of dynamic and partial reconfiguration)
- Generating compressed bitstreams with Combitgen and using the decompressor unit from Karlsruhe
Prof. Dr. Roland Kasper, Dipl.-Ing. Steffen Toscher, Magdeburg:
- „Methoden für die Hardware-ICAP Ansteuerung“ (methods for hardware ICAP control): direct connection between the Magdeburg ICAP and main memory to speed up reconfiguration
Prof. Dr.-Ing. Jürgen Teich, Dipl.-Ing. Mateusz Majer, Erlangen:
- „Integration von Videofilterengines auf der ESM-Plattform“ (integration of video filter engines on the ESM platform; diploma thesis)
Prof. Dr. Udo Kebschull, Dipl.-Ing. Norbert Abel, Heidelberg:
- Knowledge transfer on dynamic partial reconfiguration
Publications
C. Claus, F. H. Müller, W. Stechele: „Combitgen: A new approach for creating partial bitstreams in Virtex-II Pro devices“, ARCS 2006, Dynamically Reconfigurable Systems (DRS) Workshop, Frankfurt
W. Stechele: „Video Processing using Reconfigurable Hardware Acceleration for Driver Assistance“, DATE 2006, Future Trends in Automotive Electronics and Tool Integration Workshop, Munich
R. Denchev, W. Stechele: „An Experimentation Environment for MPEG-7 based Driver Assistance“, Eurocon 2005, Belgrade
W. Stechele: „Dynamically Reconfigurable Systems-on-Chip for Video-based Driver Assistance“, Dagstuhl Seminar Proceedings 06141 on Dynamically Reconfigurable Architectures, April 2-7, 2006
C. Claus, H. C. Shin, W. Stechele: „Tunnel Entrance Recognition for video-based Driver Assistance Systems“, submitted to IWSSIP 2006, 13th International Conference on Systems, Signals and Image Processing
Outlook
-Achieve faster reconfiguration times
-Reconfigure complete HW accelerators (Engines)
-Improve Combitgen (Compression of bitstreams)
-Improve Configuration Visualisation
-Transfer current system to XUP board
-More Engines (ShapeEngine, Lane detection…)
-Demonstrator
-Use synergies between SPP1148 partners
-New cooperations
-Defragmentation approach
Defragmentation approach
-Different engines occupy different amounts of physical resources
-PPCs represent non-homogeneous areas inside the reconfigurable area
[Figure: reconfigurable area next to the PLB. Labels: Engine1 to Engine5, each with a bus macro (BM), and a non-homogeneous area.]
Thank you very much for your attention!
Questions, discussion?
Special thanks to:
Johannes Zeppenfeld
Jian Wang
Hoo Chang Shin
Florian Müller
Carlos Bernal
Ingmar Cramm
Nicolas Alt