Teil 2

Transcrição

Teil 2
Beispielproblem: Kürzeste Wege
Landkarte
->
Graph -> Adjazenzmatrix/
Entfernungsmatrix
Hauptbahnhof
1
1
200
Elisabethkirche 2
2
300
150
400
3
3
125
4
5
Parallelität in funktionalen Programmiersprachen 2010
Stadthalle
50
4
100
Alte Aula
5
0
200 300 ∞ ∞
200 0 150 ∞ 400
300 150 0 50 125
∞
∞ 50
0 100
∞
400 125 100 0
Mensa
Wie lang ist der kürzeste Weg
von A nach B für beliebige
Knoten A und B?
37
Ring Skelett
rowi
rowk
Warshalls Algorithmus
in Prozessring
Erzwinge frühzeitige
Auswertung von nextrowk
ring_iterate :: Int -> Int -> Int ->
[Int] -> [[Int]] -> ( [Int], [[Int]])
ring_iterate size k i rowk (rowi:xs)
| i > size = (rowk, [])
-- Ende der Iterationen
| i == k
= (rowR, rowk:restoutput) –- sende eigene row
| otherwise = (rowR, rowi:restoutput) –- aktualisiere row
where
(rowR, restoutput) = ring_iterate size k (i+1) nextrowk xs
nextrowk | i == k
= rowk -- no update, if own row
| otherwise = updaterow rowk rowi (rowk!!(i-1))
Parallelität in funktionalen Programmiersprachen 2010
38
Traces zu parallelem Warshall
Ende der
schnelleren
Version
Parallelität in funktionalen Programmiersprachen 2010
Mit
frühzeitiger
Auswertung
von
nextrowk
39
Divide-and-conquer-Skelette
dc :: (a->Bool) -> (a->b) -> (a->[a]) -> ([b]->b) -> a->b
dc trivial solve split combine task
=
if trivial task then solve task
else combine (map rec_dc (split task))
where rec_dc = dc trivial solve split combine
regular binary scheme
with default placing:
1
2
11
11
22
3
3
5
1
4
3
4
6
22
4
7
3
5
5
8
1
4
3
6
4
22
7
4
3
5
8
5
40
Explicit Placement via Ticket List
2, 3, 4, 5, 6, 7, 8
1
unshuffle
3, 5, 7
11
5
11
22
7
3
4, 6, 8
22
6
4
5
8
1
4
5
3
6
7
22
7
6
4
5
8
1
5
4
3
6
7
22
7
6
4
5
8
41
Regular DC-Skeleton with Ticket Placement
dcNTickets :: (Trans a, Trans b) =>
Int -> [Int] -> ...
-- branch degree / tickets / ...
dcNTickets k [] trivial solve split combine
= dc trivial solve split combine
dcNTickets k tickets trivial solve split combine x
= if trivial x then solve x
else
childRes `pseq` rnf myRes `pseq` -- demand control
combine (myRes:childRes ++ localRess )
where childRes = spawnAt childTickets childProcs procIns
childProcs = map (process . rec_dcN) theirTs
rec_dcN ts = dcNTickets k ts trivial solve split combine
-- ticket distribution
(childTickets, restTickets) = splitAt (k-1) tickets
(myTs: theirTs) = unshuffle k restTickets
-- input splitting
(myIn:theirIn) = split x
(procIns, localIns) = splitAt (length childTickets) theirIn
-- local computations
myRes = dcNTickets k myTs trivial solve split combine myIn
localRess = map (dc trivial solve split combine) localIns
42
Regular DC-Skeleton with Ticket Placement
dcNTickets :: (Trans a, Trans b) =>
Int -> [Int] -> ...
-- branch degree / tickets / ...
dcNTickets k [] trivial solve split combine
= dc trivial solve split combine
dcNTickets k tickets trivial solve split combine x
= if trivial x then solve x
but fixed
branching
else • arbitrary,
childRes `pseq`
rnf myRes
`pseq` degree
-- demand control
combine (myRes:childRes ++ localRess )
flexible,= works
where •childRes
spawnAt with
childTickets childProcs procIns
childProcs = map (process . rec_dcN) theirTs
• too few tickets
rec_dcN ts = dcNTickets k ts trivial solve split combine
-- ticket distribution
• double tickets
(childTickets, restTickets) = splitAt (k-1) tickets
theirTs)
= unshuffle
k restTickets
•(myTs:
parallel
unfolding
controlled
by ticket list
-- input splitting
(myIn:theirIn) = split x
(procIns, localIns) = splitAt (length childTickets) theirIn
-- local computations
myRes = dcNTickets k myTs trivial solve split combine myIn
localRess = map (dc trivial solve split combine) localIns
43
Case Study: Karatsuba
• multiplication of large integers
• fixed branching degree 3
• complexity O(nlog2 3),
combine complexity O(n)
• Platform: LAN (Fast Ethernet),
7 dual-core linux workstations,
2 GB RAM
• input size: 2 integers with
32768 digits each
44
Divide-and-Conquer mittels Master-Worker
• Bisherige Arbeitsaufteilung:
(distributed expansion)
• Alternativer Ansatz:
(flat expansion)
Parallelität in funktionalen Programmiersprachen WS 2010/11
45
Divide-and-Conquer mittels Master-Worker
divConRW :: (Trans a,Trans b, Show b, Show a, NFData b) =>
Int -> (a->Bool) -> (a->b) -> (a->[a]) -> ([b]->b) -> a -> b
divConRW depth trivial solve split combine x
= combineTopMaster (\_ -> combine) levels results
where
(tasks,levels) = generateTasks depth trivial split x
results
= workpool noPe prefetch dcSeq tasks
dcSeq
= dc trivial solve split combine
Parallelität in funktionalen Programmiersprachen WS 2010/11
46
Glasgow Haskell Compiler
& Eden Erweiterungen
Haskell Eden
Lex/Yacc parser
prefix form
core syntax
simplify
core to STG
STG
simplify
reader
code generation
abs. syntax
renamer
Front
end
abs.
syntax
type checker
abs. syntax
desugarer
Parallelität in funktionalen Programmiersprachen WS 2010/11
(parallel Eden
runtime system)
abstract C
flattening
C
Message
Passing
Library
MPI or PVM
C compiler
47
Edens paralleles Laufzeitsystem
Modifikation von GUM, dem Laufzeitsystem von GpH (Glasgow Parallel Haskell):
•
•
Wiederverwendung
– Threadverwaltung:
– Speicherverwaltung:
– Kommunikation:
Heap Objekte, Thread Scheduler
lokale Garbage Collection
Graph Pack- und Entpack-Routinen
Neuentwicklung
– Prozessverwaltung:
Laufzeittabellen, Erzeugung und Termination,
Lastbalancierung und Arbeitsverteilung
– Kommunikationsroutinen: adaptive Kommunikation, Bypassing
•
Vereinfachungen
– kein „virtual shared memory“ (globaler Adreßbereich)
– keine Globalisierung unausgewerteter Daten
Parallelität in funktionalen Programmiersprachen WS 2010/11
48
Implementierung von Eden
• Parallele Programmierung auf hoher
Abstraktionsebene
– explizite Prozessdefinitionen
Eden
– implizite Kommunikation
?
• Automatische Prozessverwaltung
– verteilte Graphreduktion
– Verwaltung von Prozessen
und Kommunikationen
Eden
Laufzeitsystem
(RTS)
Parallelität in funktionalen Programmiersprachen WS 2010/11
49
Ansatz
• Parallele Programmierung auf hoher
Abstraktionsebene
– explizite Prozessdefinitionen
– implizite Kommunikation
Eden
Eden Modul
• Automatische Prozessverwaltung
– verteilte Graphreduktion
– Verwaltung von Prozessen
und Kommunikationen
Parallelität in funktionalen Programmiersprachen WS 2010/11
Eden RTS
50
Ebenenstruktur
Edenprogramme
Skelettbibliotheken
Eden Modul
Primitive Operationen
Paralleles GHC Laufzeitsystem
Parallelität in funktionalen Programmiersprachen WS 2010/11
51
Das Eden Modul
Typklasse Trans
• enthält alle übertragbaren Datentypen
• überladene Kommunikationsfunktionen
für Listen (-> streams): write :: a -> IO ()
und Tupel: createComm :: IO (ChanName a, a)
process :: (Trans a, Trans b) => (a -> b) -> Process a b
(#)
:: (Trans a, Trans b) => Process a b
-> a -> b
explizite Definitionen
von process, ( # )
und Process
Parallelität in funktionalen Programmiersprachen WS 2010/11
explizite Kanäle
• newtype ChanName a
= Comm (a -> IO ())
52
Die Typklasse Trans
instance (Trans a, Trans b) =>
class NFData a => Trans a where
Trans (a,b) where
write
:: a -> IO ( )
createComm
write x = rnf x `pseq`
= do (cx,x) <- createC
sendData Data x
(cy,y) <- createC
createComm :: (ChanName a, a)
return
createComm
(Comm (write2 (cx,cy)),
= do(cx, x) <- createC
(x,y))
return (Comm (sendVia cx), x)
sendVia ch d
write2 :: (Trans a, Trans b) =>
= do { connectToPort ch; write d }
(ChanName' a, ChanName' b) ->
instance Trans a => Trans [a] where
write l@[] = sendData Data l
write (x:xs)
= do (rnf x `pseq` sendData Stream x)
write xs
Parallelität in funktionalen Programmiersprachen WS 2010/11
(a,b) -> IO ()
write2 (c1,c2) (x1,x2)
= do fork (sendVia c1 x1)
sendVia c2 x2
53
Parprim – Die Schnittstelle zum parallelen
Laufzeitsystem
Primitive Operationen stellen Basisfunktionalität bereit:
• channel administration
primitive channels
data ChanName' a = Chan Int# Int# Int#
create communication channel(s) createC :: IO ( ChanName' a, a )
connect communication channel connectToPort :: ChanName ' a -> IO ()
• communication
send data
modi
sendData :: Mode -> a -> IO ()
• thread creation
fork :: IO () -> IO ()
• general
noPE, selfPE
Parallelität in funktionalen Programmiersprachen WS 2010/11
data Mode = Connect | Stream
| Data | Instantiate Int
54
Remote Process Creation
data (Trans a, Trans b) =>
Process a b = Proc (ChanName b -> ChanName‘ (ChanName a) -> IO ( ) )
output channel(s)
process
channel for
returning input
channel handle(s)
:: (a -> b) -> Process a b
process f = Proc f_remote
where f_remote (Comm sendResult) inCC
= do (sendInput, invals) = createComm
connectToPort inCC
sendData Data sendInput
sendResult (f invals)
Parallelität in funktionalen Programmiersprachen WS 2010/11
Process Output
55
Process Instantiation
( # ) :: (Trans a, Trans b) => Process a b -> a -> b
pabs # inps = unsafePerformIO $ instantiateAt 0 pabs inps
instantiateAt :: (Trans a, Trans b) => Int -> Process a b -> a -> IO b
instantiateAt pe (Proc f_remote) inps
= do (sendresult, result)
<- createComm
output channel(s)
(inCC, Comm sendInput)<- createC
sendData (Instantiate pe) (f_remote sendresult inCC)
channel for
fork (sendInput inps)
returning input
return result
channel handle(s)
-- variant of ( # ) which immediately delivers a whnf
data Lift a = Lift a;
deLift (Lift x) = x
createProcess :: (Trans a, Trans b) => Process a b -> a -> Lift b
createProcess p i = unsafePerformIO (instantiateAt 0 p i >>= \x -> return (Lift x))
Parallelität in funktionalen Programmiersprachen WS 2010/11
56

Documentos relacionados