Teil 2
Transcrição
Teil 2
Beispielproblem: Kürzeste Wege Landkarte -> Graph -> Adjazenzmatrix/ Entfernungsmatrix Hauptbahnhof 1 1 200 Elisabethkirche 2 2 300 150 400 3 3 125 4 5 Parallelität in funktionalen Programmiersprachen 2010 Stadthalle 50 4 100 Alte Aula 5 0 200 300 ∞ ∞ 200 0 150 ∞ 400 300 150 0 50 125 ∞ ∞ 50 0 100 ∞ 400 125 100 0 Mensa Wie lang ist der kürzeste Weg von A nach B für beliebige Knoten A und B? 37 Ring Skelett rowi rowk Warshalls Algorithmus in Prozessring Erzwinge frühzeitige Auswertung von nextrowk ring_iterate :: Int -> Int -> Int -> [Int] -> [[Int]] -> ( [Int], [[Int]]) ring_iterate size k i rowk (rowi:xs) | i > size = (rowk, []) -- Ende der Iterationen | i == k = (rowR, rowk:restoutput) –- sende eigene row | otherwise = (rowR, rowi:restoutput) –- aktualisiere row where (rowR, restoutput) = ring_iterate size k (i+1) nextrowk xs nextrowk | i == k = rowk -- no update, if own row | otherwise = updaterow rowk rowi (rowk!!(i-1)) Parallelität in funktionalen Programmiersprachen 2010 38 Traces zu parallelem Warshall Ende der schnelleren Version Parallelität in funktionalen Programmiersprachen 2010 Mit frühzeitiger Auswertung von nextrowk 39 Divide-and-conquer-Skelette dc :: (a->Bool) -> (a->b) -> (a->[a]) -> ([b]->b) -> a->b dc trivial solve split combine task = if trivial task then solve task else combine (map rec_dc (split task)) where rec_dc = dc trivial solve split combine regular binary scheme with default placing: 1 2 11 11 22 3 3 5 1 4 3 4 6 22 4 7 3 5 5 8 1 4 3 6 4 22 7 4 3 5 8 5 40 Explicit Placement via Ticket List 2, 3, 4, 5, 6, 7, 8 1 unshuffle 3, 5, 7 11 5 11 22 7 3 4, 6, 8 22 6 4 5 8 1 4 5 3 6 7 22 7 6 4 5 8 1 5 4 3 6 7 22 7 6 4 5 8 41 Regular DC-Skeleton with Ticket Placement dcNTickets :: (Trans a, Trans b) => Int -> [Int] -> ... -- branch degree / tickets / ... dcNTickets k [] trivial solve split combine = dc trivial solve split combine dcNTickets k tickets trivial solve split combine x = if trivial x then solve x else childRes `pseq` rnf myRes `pseq` -- demand control combine (myRes:childRes ++ localRess ) where childRes = spawnAt childTickets childProcs procIns childProcs = map (process . rec_dcN) theirTs rec_dcN ts = dcNTickets k ts trivial solve split combine -- ticket distribution (childTickets, restTickets) = splitAt (k-1) tickets (myTs: theirTs) = unshuffle k restTickets -- input splitting (myIn:theirIn) = split x (procIns, localIns) = splitAt (length childTickets) theirIn -- local computations myRes = dcNTickets k myTs trivial solve split combine myIn localRess = map (dc trivial solve split combine) localIns 42 Regular DC-Skeleton with Ticket Placement dcNTickets :: (Trans a, Trans b) => Int -> [Int] -> ... -- branch degree / tickets / ... dcNTickets k [] trivial solve split combine = dc trivial solve split combine dcNTickets k tickets trivial solve split combine x = if trivial x then solve x but fixed branching else • arbitrary, childRes `pseq` rnf myRes `pseq` degree -- demand control combine (myRes:childRes ++ localRess ) flexible,= works where •childRes spawnAt with childTickets childProcs procIns childProcs = map (process . rec_dcN) theirTs • too few tickets rec_dcN ts = dcNTickets k ts trivial solve split combine -- ticket distribution • double tickets (childTickets, restTickets) = splitAt (k-1) tickets theirTs) = unshuffle k restTickets •(myTs: parallel unfolding controlled by ticket list -- input splitting (myIn:theirIn) = split x (procIns, localIns) = splitAt (length childTickets) theirIn -- local computations myRes = dcNTickets k myTs trivial solve split combine myIn localRess = map (dc trivial solve split combine) localIns 43 Case Study: Karatsuba • multiplication of large integers • fixed branching degree 3 • complexity O(nlog2 3), combine complexity O(n) • Platform: LAN (Fast Ethernet), 7 dual-core linux workstations, 2 GB RAM • input size: 2 integers with 32768 digits each 44 Divide-and-Conquer mittels Master-Worker • Bisherige Arbeitsaufteilung: (distributed expansion) • Alternativer Ansatz: (flat expansion) Parallelität in funktionalen Programmiersprachen WS 2010/11 45 Divide-and-Conquer mittels Master-Worker divConRW :: (Trans a,Trans b, Show b, Show a, NFData b) => Int -> (a->Bool) -> (a->b) -> (a->[a]) -> ([b]->b) -> a -> b divConRW depth trivial solve split combine x = combineTopMaster (\_ -> combine) levels results where (tasks,levels) = generateTasks depth trivial split x results = workpool noPe prefetch dcSeq tasks dcSeq = dc trivial solve split combine Parallelität in funktionalen Programmiersprachen WS 2010/11 46 Glasgow Haskell Compiler & Eden Erweiterungen Haskell Eden Lex/Yacc parser prefix form core syntax simplify core to STG STG simplify reader code generation abs. syntax renamer Front end abs. syntax type checker abs. syntax desugarer Parallelität in funktionalen Programmiersprachen WS 2010/11 (parallel Eden runtime system) abstract C flattening C Message Passing Library MPI or PVM C compiler 47 Edens paralleles Laufzeitsystem Modifikation von GUM, dem Laufzeitsystem von GpH (Glasgow Parallel Haskell): • • Wiederverwendung – Threadverwaltung: – Speicherverwaltung: – Kommunikation: Heap Objekte, Thread Scheduler lokale Garbage Collection Graph Pack- und Entpack-Routinen Neuentwicklung – Prozessverwaltung: Laufzeittabellen, Erzeugung und Termination, Lastbalancierung und Arbeitsverteilung – Kommunikationsroutinen: adaptive Kommunikation, Bypassing • Vereinfachungen – kein „virtual shared memory“ (globaler Adreßbereich) – keine Globalisierung unausgewerteter Daten Parallelität in funktionalen Programmiersprachen WS 2010/11 48 Implementierung von Eden • Parallele Programmierung auf hoher Abstraktionsebene – explizite Prozessdefinitionen Eden – implizite Kommunikation ? • Automatische Prozessverwaltung – verteilte Graphreduktion – Verwaltung von Prozessen und Kommunikationen Eden Laufzeitsystem (RTS) Parallelität in funktionalen Programmiersprachen WS 2010/11 49 Ansatz • Parallele Programmierung auf hoher Abstraktionsebene – explizite Prozessdefinitionen – implizite Kommunikation Eden Eden Modul • Automatische Prozessverwaltung – verteilte Graphreduktion – Verwaltung von Prozessen und Kommunikationen Parallelität in funktionalen Programmiersprachen WS 2010/11 Eden RTS 50 Ebenenstruktur Edenprogramme Skelettbibliotheken Eden Modul Primitive Operationen Paralleles GHC Laufzeitsystem Parallelität in funktionalen Programmiersprachen WS 2010/11 51 Das Eden Modul Typklasse Trans • enthält alle übertragbaren Datentypen • überladene Kommunikationsfunktionen für Listen (-> streams): write :: a -> IO () und Tupel: createComm :: IO (ChanName a, a) process :: (Trans a, Trans b) => (a -> b) -> Process a b (#) :: (Trans a, Trans b) => Process a b -> a -> b explizite Definitionen von process, ( # ) und Process Parallelität in funktionalen Programmiersprachen WS 2010/11 explizite Kanäle • newtype ChanName a = Comm (a -> IO ()) 52 Die Typklasse Trans instance (Trans a, Trans b) => class NFData a => Trans a where Trans (a,b) where write :: a -> IO ( ) createComm write x = rnf x `pseq` = do (cx,x) <- createC sendData Data x (cy,y) <- createC createComm :: (ChanName a, a) return createComm (Comm (write2 (cx,cy)), = do(cx, x) <- createC (x,y)) return (Comm (sendVia cx), x) sendVia ch d write2 :: (Trans a, Trans b) => = do { connectToPort ch; write d } (ChanName' a, ChanName' b) -> instance Trans a => Trans [a] where write l@[] = sendData Data l write (x:xs) = do (rnf x `pseq` sendData Stream x) write xs Parallelität in funktionalen Programmiersprachen WS 2010/11 (a,b) -> IO () write2 (c1,c2) (x1,x2) = do fork (sendVia c1 x1) sendVia c2 x2 53 Parprim – Die Schnittstelle zum parallelen Laufzeitsystem Primitive Operationen stellen Basisfunktionalität bereit: • channel administration primitive channels data ChanName' a = Chan Int# Int# Int# create communication channel(s) createC :: IO ( ChanName' a, a ) connect communication channel connectToPort :: ChanName ' a -> IO () • communication send data modi sendData :: Mode -> a -> IO () • thread creation fork :: IO () -> IO () • general noPE, selfPE Parallelität in funktionalen Programmiersprachen WS 2010/11 data Mode = Connect | Stream | Data | Instantiate Int 54 Remote Process Creation data (Trans a, Trans b) => Process a b = Proc (ChanName b -> ChanName‘ (ChanName a) -> IO ( ) ) output channel(s) process channel for returning input channel handle(s) :: (a -> b) -> Process a b process f = Proc f_remote where f_remote (Comm sendResult) inCC = do (sendInput, invals) = createComm connectToPort inCC sendData Data sendInput sendResult (f invals) Parallelität in funktionalen Programmiersprachen WS 2010/11 Process Output 55 Process Instantiation ( # ) :: (Trans a, Trans b) => Process a b -> a -> b pabs # inps = unsafePerformIO $ instantiateAt 0 pabs inps instantiateAt :: (Trans a, Trans b) => Int -> Process a b -> a -> IO b instantiateAt pe (Proc f_remote) inps = do (sendresult, result) <- createComm output channel(s) (inCC, Comm sendInput)<- createC sendData (Instantiate pe) (f_remote sendresult inCC) channel for fork (sendInput inps) returning input return result channel handle(s) -- variant of ( # ) which immediately delivers a whnf data Lift a = Lift a; deLift (Lift x) = x createProcess :: (Trans a, Trans b) => Process a b -> a -> Lift b createProcess p i = unsafePerformIO (instantiateAt 0 p i >>= \x -> return (Lift x)) Parallelität in funktionalen Programmiersprachen WS 2010/11 56