Migration
ZKI AK-Supercomputing, 6./7.10.2011, Jena
Data Management with the IBM General Parallel File System
Karsten Kutzer (kutzer @ de.ibm.com), IT Architect Deep Computing
© 2011 IBM Corporation

Agenda
- Data management from the point of view of users and administrators
- A short overview of GPFS
- Data management in practice
  - Lifecycle management of the storage infrastructure
  - Management of user data

Data management: different points of view
[Diagram: the user sees clients attached to the I/O network; the administrator sees servers, disk, backup and tape.]

Data management: how it often looks in practice
[Diagram: separate Scratch, Home and Archive areas, each with its own servers and disks. Data is copied between them over the I/O network, and the administrator runs backup to tape.]

The history of GPFS
[Timeline figure, 1993 to 2011: from Tiger Shark through GPFS releases 1.x to 3.4 on AIX 5/6/7, pLinux, Linux and Windows 2008 R2, with cluster types sp, hacmp, rpd and lc, shown alongside IBM SAN File System (SFS) v1.0 and v1.1. Additions labeled on the timeline: Information Lifecycle Management (ILM), remote mount capabilities (WAN), interoperability, Disaster Recovery (DR).]

GPFS overview
GPFS provides:
- A "global namespace" across platforms (AIX, Linux, Windows 2008)
- Elimination of data copies
- Optimized storage utilization
- Simple management
- Scalable bandwidth
- Full WAN support
- Storage tier support
- HSM support
- Fast backup
[Diagram labels: multi storage tier support; data path: SAN, TCP/IP, InfiniBand; management: centralized monitoring, automated file management; availability: data migration, replication, backup; consumers: databases, file servers, backup / archive, application servers.]

GPFS architectures 1/3
[Diagram: NSD, SAN, LUN]
(NSD = Network Shared Disk)

GPFS architectures 2/3
[Diagram: NSD clients, I/O network, NSD servers, SAN, LUNs]

GPFS architectures 3/3
[Diagram: NSD clients, I/O network (LAN / WAN / IB), NSD servers, SAN, LUNs]
Generations of supercomputers

                   JUMP                        JUBL                     JUGENE
                   (Juelich Multi Processor)   (Jülich BlueGene/L)      (Jülich BlueGene/P)
Processors         1,312                       16,384                   294,912
Main memory        5 TB                        4.1 TB                   144 TB
Peak performance   5.6 TFLOPS                  45.87 TFLOPS             ~1 PFLOPS
Disk capacity      ~0.04 PB                    ~0.9 PB                  ~4.2 PB
Data volume        ~0.1 PB                     ~0.6 PB                  ~4 PB
I/O network        SP-Switch 2                 1 GbE                    10 GbE
I/O hardware       POWER4/DS4500               POWER5/DS4700+DCS9550    POWER6/DS5300

Timeline axis on the slide: 2004, 2007, 2009, 2011
Source: http://www2.fz-juelich.de/jsc/service/configuration/

Migration of data during a hardware technology change, keeping the I/O network technology
1. Set up the new storage building blocks
2. Integrate the new NSD servers into the cluster
3. Migrate the data online
4. Remove the old NSD servers from the cluster
5. Dismantle the old storage building blocks
The whole procedure is "invisible" to user operation.
[Diagram, built up over four slides: clients on the I/O network; new (NEU) building blocks are added next to the old (ALT) ones, data migrates from old to new, and finally only the new ones remain.]

Current storage infrastructure
[Figure not preserved in the transcript]

Data growth
Data growth per month: more than a factor of 5 in 3 years
Source: Forschungszentrum Jülich
[Chart not preserved in the transcript]

Total data volume, including Hierarchical Storage Management (HSM)
[Chart not preserved in the transcript]

GPFS data placement and lifecycle policies
Lifecycle and location management: simple, scriptable
The scan and migration engine of the General Parallel File System determines, according to predefined rules, where data is stored over the course of its lifecycle.
(1) Placement policies, applied when files are created:
    rule VMWare set pool SAS for fileset VMWare
    rule otherfiles set pool NL-SAS
(2) Migration policies, applied periodically:
    rule CleanSAS migrate from pool SAS threshold (90,70) to pool NL-SAS
    rule CleanNL-SAS when day_of_week() = monday migrate from pool NL-SAS to pool TAPE where access_age > 60 days
(3) Deletion policies, applied periodically:
    rule CleanTAPE when day_of_month() = 1 delete from pool TAPE where access_age > 365 days
[Diagram: Tier 1, Tier 2, Tier 3]

GPFS policy scan engine
Example files in the global namespace (directories /home, /appl, /data, /web):
    /home/appl/data/web/important_big_spreadsheet.xls
    /home/appl/data/web/big_architecture_drawing.ppt
    /home/appl/data/web/unstructured_big_video.mpg
1. Start the scan
2. Read the policies
3. Scan in parallel
4. Return the results
[Diagram: IBM GPFS global namespace and policy engine above storage nodes in Tier 1, Tier 2 and Tier 3]

GPFS policy scan engine: migration
5. Process the results
6. Move the data to other tiers or into HSM
All files remain visible, unchanged, in the directory structure!

Rule set for HSM migrations

> cat mmpolicyRules-hsm
define( STUB_SIZE, 0 )
define( is_premigrated, (MISC_ATTRIBUTES LIKE '%M%' AND KB_ALLOCATED > STUB_SIZE) )
define( access_age, (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) )
define( mb_allocated, (INTEGER(KB_ALLOCATED / 1024)) )
define( weight_expression,
        (CASE WHEN mb_allocated < 1 THEN 0
              WHEN access_age < 14  THEN access_age
              WHEN is_premigrated   THEN KB_ALLOCATED * access_age
              ELSE mb_allocated * access_age
         END) )

RULE EXTERNAL POOL 'hsm' EXEC '/var/mmfs/etc/mmpolicyExec-hsm.sample' OPTS '-v'

/* Exclude .SpaceMan */
RULE 'exclude_spaceman' EXCLUDE WHERE PATH_NAME LIKE '%/.SpaceMan/%'

RULE 'thresholdMigration' MIGRATE FROM POOL 'system' THRESHOLD(61,60,60)
     WEIGHT(weight_expression) TO POOL 'hsm'

Duration of an HSM migration

Starting TSM migration of /arch: Mon Sep 13 02:46:26 MESZ 2010 ...
[…]
Directories scan: 29963618 files, 681270 directories, 400413 other objects, 0 'skipped' files and/or errors.
Inodes scan: 29963618 files, 681270 directories, 400413 other objects, 0 'skipped' files and/or errors.
Summary of Rule Applicability and File Choices:
 Rule#  Hit_Cnt   Chosen  KB_Chosen    KB_Ill  Rule
 0      72        0       0            0       RULE 'exclude_spaceman' EXCLUDE WHERE(.)
 1      29963546  1397    13672738624  0       RULE 'thresholdMigration' MIGRATE FROM POOL 'system' THRESHOLD(61,60,60) WEIGHT(.) TO POOL 'hsm'
 +      +         158285  223055168    0       (ALREADY CO-MANAGED)
Files with no applicable rules: 1081683.
GPFS Policy Decisions and File Choice Totals:
 Chose to migrate 13672738624KB: 1397 of 29963546 candidates;
 Chose to premigrate 0KB: 0 candidates;
 Already co-managed 223055168KB: 158285 candidates;
 Chose to delete 0KB: 0 of 0 candidates;
 0KB of chosen data is illplaced or illreplicated;
Predicted Data Pool Utilization in KB and %:
 system 103087409344 171817631744 59.998155%
[…]
IBM Tivoli Storage Manager
Command Line Space Management Client Interface
 Client Version 5, Release 5, Level 2.6
 Client date/time: 09/13/10 02:15:11
(c) Copyright by IBM Corporation and other(s) 1990, 2009. All Rights Reserved.
[…]
Finished TSM migration of /arch: Mon Sep 13 11:19:03 MESZ 2010

Summary of this run:
- 1,397 of 29.9 million files selected
- 12.7 TB migrated
- 08:33 h elapsed (for comparison: a classic dsmc backup takes ~30 h)
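The figures quoted for this run can be cross-checked against the log excerpt with a few lines of arithmetic. The sketch below copies the timestamps and KB counts from the log; the one assumption is that "KB" in the log means 1024 bytes. The variable names are illustrative only.

```python
from datetime import datetime

# Values copied from the log excerpt.
START = datetime(2010, 9, 13, 2, 46, 26)   # "Starting TSM migration of /arch"
END = datetime(2010, 9, 13, 11, 19, 3)     # "Finished TSM migration of /arch"
KB_MIGRATED = 13_672_738_624               # "Chose to migrate ...KB"
FILES_CHOSEN = 1_397
CANDIDATES = 29_963_546
POOL_KB_USED_AFTER = 103_087_409_344       # predicted occupancy of pool 'system'
POOL_KB_TOTAL = 171_817_631_744            # size of pool 'system'

# Assumption: 1 KB = 1024 bytes, so 1 TiB = 1024**3 KB and 1 MiB = 1024 KB.
elapsed_s = (END - START).total_seconds()
hours, rest = divmod(int(elapsed_s), 3600)
minutes, seconds = divmod(rest, 60)

tib_migrated = KB_MIGRATED / 1024**3
mib_per_s = KB_MIGRATED / 1024 / elapsed_s
utilization_pct = 100 * POOL_KB_USED_AFTER / POOL_KB_TOTAL
selected_pct = 100 * FILES_CHOSEN / CANDIDATES

print(f"elapsed:      {hours:02d}:{minutes:02d}:{seconds:02d}")
print(f"migrated:     {tib_migrated:.2f} TiB")
print(f"throughput:   {mib_per_s:.0f} MiB/s averaged over the whole run")
print(f"utilization:  {utilization_pct:.6f} % predicted after migration")
print(f"selected:     {selected_pct:.4f} % of all candidate files")
```

Under that assumption the elapsed time comes out at 08:32:37 (the slide rounds to 08:33 h), the migrated volume at roughly 12.7 TiB, and the predicted utilization reproduces the 59.998155% in the log. The average throughput of roughly 434 MiB/s covers the whole run, including the scan phase, so it is a lower bound on the data mover rate rather than a measurement of it.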
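To make the interplay of THRESHOLD and WEIGHT in the talk's HSM rule set more concrete, here is a minimal Python sketch of the selection logic as the rule text describes it: files are ranked by the weight expression, and the highest-ranked files are chosen until pool occupancy drops from above the high threshold to the low threshold. This is an illustration under stated assumptions, not how GPFS implements its policy engine. `FileInfo`, `weight` and `choose_for_migration` are hypothetical names, the premigrated state is reduced to a boolean, and the third THRESHOLD value (premigration) is ignored.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    kb_allocated: int
    access_age_days: int
    premigrated: bool = False  # stands in for the is_premigrated macro


def weight(f: FileInfo) -> int:
    """Python rendering of the weight_expression CASE in the rule set."""
    mb_allocated = f.kb_allocated // 1024  # INTEGER(KB_ALLOCATED / 1024)
    if mb_allocated < 1:
        return 0
    if f.access_age_days < 14:
        return f.access_age_days
    if f.premigrated:
        return f.kb_allocated * f.access_age_days
    return mb_allocated * f.access_age_days


def choose_for_migration(files, pool_kb_total, high_pct, low_pct):
    """Choose files, highest weight first, until occupancy is at most low_pct.

    Nothing is chosen while occupancy is still below high_pct.
    """
    used_kb = sum(f.kb_allocated for f in files)
    if 100 * used_kb / pool_kb_total < high_pct:
        return []
    target_kb = pool_kb_total * low_pct / 100
    chosen = []
    for f in sorted(files, key=weight, reverse=True):
        if used_kb <= target_kb:
            break
        chosen.append(f)
        used_kb -= f.kb_allocated
    return chosen


# Made-up example data: a 10,000,000 KB pool that is about 92 % full.
files = [
    FileInfo("/arch/a", 500, 400),                          # under 1 MB
    FileInfo("/arch/b", 4_096_000, 5),                      # recently accessed
    FileInfo("/arch/c", 2_048_000, 100, premigrated=True),  # already copied out
    FileInfo("/arch/d", 3_072_000, 100),                    # large and old
]
chosen = choose_for_migration(files, 10_000_000, high_pct=90, low_pct=70)
print([f.path for f in chosen])
```

With the (90,70) thresholds from the CleanSAS example rule, the sketch picks the premigrated file first and then the large old file, after which occupancy is below 70% and selection stops. Small and recently used files stay on disk, which is the effect the weight expression is designed to produce.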