大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件
《大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件》由會(huì)員分享,可在線閱讀,更多相關(guān)《大型稀疏矩陣的LU分解及特征值求解Grusoftcom課件(39頁(yè)珍藏版)》請(qǐng)?jiān)谘b配圖網(wǎng)上搜索。
1、單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,,,*,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,,,*,,,,本作品采用,知識(shí)共享署名,-,非商業(yè)性使用,2.5,中國(guó)大陸許可協(xié)議,進(jìn)行許可。,,專業(yè)交流,模板超市,設(shè)計(jì)服務(wù),NordriDesign,中國(guó)專業(yè),PowerPoint,媒體設(shè)計(jì)與開(kāi)發(fā),本作品的提供是以適用知識(shí)共享組織的公共許可( 簡(jiǎn)稱“,CCPL”,或 “許可”) 條款為前提的。本作品
2、受著作權(quán)法以及其他相關(guān)法律的保護(hù)。對(duì)本作品的使用不得超越本許可授權(quán)的范圍。,,如您行使本許可授予的使用本作品的權(quán)利,就表明您接受并同意遵守本許可的條款。在您接受這些條款和規(guī)定的前提下,許可人授予您本許可所包括的權(quán)利。,,,,查看全部,…,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,,,*,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,,,*,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第
3、二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,單擊此處編輯母版標(biāo)題樣式,,單擊此處編輯母版文本樣式,,第二級(jí),,第三級(jí),,第四級(jí),,第五級(jí),,2013. 7. 20,大型稀疏矩陣,,的,LU,分解及特征值求解,陳英時(shí),,2016 . 1. 9,稀疏矩陣求解的廣泛應(yīng)用,矩陣求解是數(shù)值計(jì)算的核心,[1],,稀疏矩陣求解是數(shù)值計(jì)算的關(guān)鍵之一,,偏微分方程,積分方程,特征值,優(yōu)化,…,,,萬(wàn)階以上,dense matrix,不可行,,稀疏矩陣求解往往是資源瓶頸,,時(shí)間瓶頸,內(nèi)存,外存等瓶頸,,,直接法基于高斯消元法,即計(jì)算,A,的,LU,分解。,A,通常要經(jīng)過(guò)一系列置換排序,可歸并為左置換矩陣,P,,右置換
4、矩陣,Q,。基本步驟如下:,,1,)符號(hào)分析,:,,得到置換排序矩陣,P Q,,,2,)數(shù)值分解:,,,,3,)前代,回代:,I.S.Duff, A.M.Erisman, and J.K.Reid. Direct Methods for Sparse Matrices. London:Oxford Univ. Press,1986.,,J.J.Dongarra,I.S.Duff, ... Numerical Linear Algebra for High-Performance Computers.,,G.H.Golob, C.F.Van loan. Matrix Computations
5、. The Johns Hopkins University Press. 1996,稀疏矩陣復(fù)雜、多變,,基本參數(shù),,,對(duì)稱性,稀疏性,非零元分布,,敏感性,病態(tài)矩陣,,,條件數(shù),,格式多變,,,Harwell-Boeing Exchange Format,。。。,,測(cè)試集,,,Harwell-Boeing Sparse Matrix Collection,,,UF sparse matrix collection,,,,求解器的飛速發(fā)展,BBMAT,,,38744,階,分解后元素超過(guò)四千萬(wàn),.,,1988,巨型機(jī),cray-2,上,>1000,秒,,2003 4G umfpac
6、k4 32.6,秒,[4],,2006 3.0G GSS1.2 15,秒,,2012 3.0G 4,核,GSS 2.3 4,秒,,2014 i7 8,核,GSS 2.4 1,秒,,硬件的發(fā)展,CPU,,內(nèi)存等,,稀疏算法逐漸成熟,,,,稀疏排序,密集子陣,multifrontal ,supernodal…,,數(shù)學(xué)庫(kù),,,BLAS,,,LAPACK,,稀疏,LU,分解算法的關(guān)鍵,根據(jù)符號(hào)分析,數(shù)值分解算法的不同,直接法有以下幾類,[15],:,,1,),general technique(,通用方法,),:,,主要采用消去樹(shù)等結(jié)構(gòu)進(jìn)行,LU,分解。適用于非常稀疏的非結(jié)構(gòu)化矩陣。,,
7、2,),frontal scheme(,波前法,),,LU,分解過(guò)程中,將連續(xù)多行合并為一個(gè)密集子塊,(,波前,),,對(duì)這個(gè)子塊采用,BLAS,等高效數(shù)學(xué)庫(kù)進(jìn)行分解。,,3,),multifrontal method(,多波前法,),,,將符號(hào)結(jié)構(gòu)相同的多行,(,列,),合并為多個(gè)密集子塊,以這些密集子塊為單位進(jìn)行,LU,分解。這些子塊的生成,消去,裝配,釋放等都需要特定的數(shù)據(jù)結(jié)構(gòu)及算法。,,1,零元是動(dòng)態(tài)的概念,需要,稀疏排序,,,減少注入元,(fill-in),2,密集子陣,稀疏,LU,分解 (不考慮,零元,的),LU,分解,稀疏排序,,排序算法的作用是減少矩陣,LU,分解過(guò)程中產(chǎn)生的注
8、入元,求解矩陣的最優(yōu)排序方法是個(gè),NP,完全問(wèn)題(不能夠在合理的時(shí)間內(nèi)進(jìn)行求解),對(duì)具體矩陣而言,目前也沒(méi)有方法或指標(biāo)來(lái)判定哪種算法好。因此實(shí)測(cè)不同的算法,對(duì)比產(chǎn)生的注入元,是確定排序算法的可靠依據(jù)。,,,主要的排序算法有最小度排序(,MMD,,,minimum degree ordering algorithm,)和嵌套排序(,nested dissection,)兩種,。,矩陣排序方面相應(yīng)的成熟軟件庫(kù)有,AMD[12],、,COLAMD,、,METIS[13],等。,,Yannokakis M. Computing the minimum fill-in in NP-Complete. S
9、IAM J.Algebraic Discrete Methods, 1981, 2:77~79,近似最小度,排序算法,–,商圖,,,近似最小度排序(,AMD,,,Approximate Minimum Degree OrderingAlgorithm,)算法于,1996,年左右由,Patrick R. Amestoy,等學(xué)者提出,AMESTOY, P. R., DAVIS, ... 1996a. An approximate minimum degree ordering algorithm. SIAM J. Matrix Anal. Applic. 17, 4, 886,–,905.,為何需
10、要密集子塊(Dense Matrix),,,多波前法,(multifrontal),簡(jiǎn)介,發(fā)展,,,Duff and Raid [2],,等分析,改進(jìn),[3],,開(kāi)發(fā),UMFPACK [4],,基本算法,,,利用稀疏矩陣的特性,得到一系列密集子陣(波前)。將,LU,分解轉(zhuǎn)化為對(duì)這些波前的裝配,消去,更新等操作。,,多波前法的優(yōu)點(diǎn),,波前是,dense matrix ,,可直接調(diào)用高性能庫(kù)(,BLAS,等),,密集子陣可以節(jié)省下標(biāo)存儲(chǔ),,提高并行性,,目前主要的求解器,,,UMFPACK,,,,GSS,,MUMPS,PARDISO,,WSMP,,HSL MA41,等,,LU,分解形成,front
11、al,10,階矩陣。,,藍(lán)點(diǎn)代表非零元。紅點(diǎn)表示分解產(chǎn)生的注入元,(fill-in),,Frontal,劃分,{a}, {c}qxifjcr {e} {f,g}{h,i,j},Frontal,的裝配,消去,更新過(guò)程,a,c h,,c · ·,,h · ·,,{c},{f,g},,{e},{h,i,j},{a},c,,g,h,,g · ·,,h · ·,b,e j,,e · ·,,j · ·,f,,g,h,,g,g,·,,h · ·,e,,i,j,,i · ·,,j · ·,h,,i, j,,i,i,·,,j ·,j,2exndmr,d,,i,j,,i · ·,,j · ·,消去
12、樹(shù),消去樹(shù),消去樹(shù)是符號(hào)分析的關(guān)鍵結(jié)構(gòu),其每個(gè)節(jié)點(diǎn)對(duì)應(yīng)于矩陣的一列(行),該節(jié)點(diǎn)只與其父節(jié)點(diǎn)相連,父節(jié)點(diǎn)定義如下:,J.W.H.Liu. The Role of Elimination Trees in Sparse Factorization. SIAM J.Matrix Anal.Applic.,11(1):134-172,1990.,,J.W.H.Liu. Elimination structures for unsymmetric sparse LU factors. SIAM J. Matrix Anal. Appl. 14, no. 2, 334--352, 1993.,,GSS,
13、簡(jiǎn)介,標(biāo)準(zhǔn),C,開(kāi)發(fā),適用于各種平臺(tái),,比,INTEL PARDISO,更快,更穩(wěn)定,,CPU/GPU,混合計(jì)算,,突破,32,位,Windows,內(nèi)存限制,,32,個(gè)用戶參數(shù),,,支持用戶定制模塊,高校,研究所,,中國(guó)電力科學(xué)研究院,,香港大學(xué) 南京大學(xué) 河海大學(xué),,中國(guó)石油大學(xué),,電子科技大學(xué),,三峽大學(xué) 山東大學(xué),,user,detail,Why they choose GSS,crosslight,Industry leader in TCAD simulation,Hybrid GPU/CPU version, more than 2 times faster than PARDIS
14、O, MUMPS and other sparse solvers.,soilvision,The most technically advanced suite of 1D/2D/3D geotechnical software,Much faster than their own sparse solver.,FEM consulting,The leading research teams in the area of the Finite Element Method since 1967,GSS is faster than PARDISO and provide many cust
15、om module.,GSCAD,Leading building software in China,GSS provide a user-specific module to deal with ill-conditioned matrix.,ICAROS,A global turnkey geospatial mapping service provider and state of the art photogrammetric technologies developer.,GSS is faster than PARDISO. Also provide some technical
16、 help.,EPRI,China Electric Power Research Institute,3-4 times faster than KLU,他們?yōu)槭裁催x擇,GSS?,GSS,,–,加權(quán),消去樹(shù),工作量消去樹(shù),,基于消去樹(shù)結(jié)構(gòu)來(lái)計(jì)算數(shù)值分解的工作量,將,LU,分解劃分為多個(gè)獨(dú)立的任務(wù),為高效并行計(jì)算奠定基礎(chǔ)。,GSS,,-,雙閾值列選主元算法,GSS,,- CPU/GPU,混合計(jì)算,1) After divides LU factorization into many parallel tasks, GSS will use adaptive strategy to run th
17、ese tasks in different hardware (CPU, GPU …). That is, if GPU have high computing power, then it will run more tasks automatically. If CPU is more powerful, then GSS will give it more tasks.,,,2) And furthermore, if CPU is free (have finished all tasks) and GPU still run a task, then GSS will divide
18、 this task to some small tasks then assign some child-task to CPU, then CPU do computing again. So get higher parallel efficiency.,,,3) GSS will also do some testing to get best parameters for different hardware.,GSS,,–,求解,頻域譜元方法生成,的,矩陣,矩陣較稠密,,約,40,萬(wàn)階,,15,億個(gè)非零元,GSS,約需,15G,內(nèi)存,需要求解多個(gè)右端項(xiàng),,32,個(gè)右端項(xiàng) 需,80
19、,秒?,進(jìn)一步優(yōu)化,,CPU/GPU,混合計(jì)算 數(shù)值分解約,35,秒,,重復(fù)利用符號(hào)分析結(jié)果,,根據(jù)矩陣的特殊結(jié)構(gòu)來(lái)進(jìn)一步減少非零元,,估計(jì),80/4=20,秒,一次,LU,分解,,符號(hào)分解時(shí)間,50,秒,,數(shù)值分解時(shí)間,46,秒,,回代,2.5,秒,對(duì)比測(cè)試,The test matrices are all from the UF sparse matrix collection,,PARDISO is from Intel Composer XE 2013 SP1.,,GSS 2.4 use CPU-GPU hybrid computing.,,The testing CPU is I
20、NTEL Core i7-4770(3.4GHz) with 24G memory. The graphics card is ASUS GTX780 (with compute capability 3.5). NVIDIA CUDA Toolkit is 5.5. The operating system is Windows 7 64. Both solvers use default parameters.,,,For large matrices need long time computing, GSS 2.4 is Nearly 3 times faster than PARDI
21、SO. For matrices need short time computing, PARDISO is faster than GSS. One reason is that complex synchronization between CPU/GPU do need some extra time.,大型稀疏矩陣的特征值求解,重視,,A,為,全過(guò)程動(dòng)態(tài)仿真程序中大規(guī)模稀疏矩陣,為,特征值,,x,為對(duì)應(yīng)的特征向量,150 Years old and still alive: eigenproblems,,,1997 --- by Henk A. van der Vorst , Gene
22、 H. Golub,稀疏,LU,分解,-,理論上即高斯消元法,,稀疏特征值,–,趨近于純數(shù)學(xué),代數(shù)特征值問(wèn)題,,,G.H.Golob, C.F.Van loan. Matrix Computations. The Johns Hopkins University Press. 1996,,Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. Zhaojun Bai , …. 2000,冪迭代,-- Power iteration,冪迭代,--,一個(gè)形象的解釋,瑞利商迭代,-- Raylei
23、gh Quotient iteration,子空間迭代,,Krylov,子空間,Arnoldi,迭代,Arnoldi,迭代的基本算法,ARPACK,,,ON RESTARTING THE ARNOLDI METHOD FOR LARGENONSYMMETRIC EIGENVALUE PROBLEMS, MORGAN,,R.B. Lehoucq. Analysis and Implementation of an Implicitly Restarted Iteration. PhD thesis,Arnoldi,迭代求解特征值,Krylov,分解,,--,基于酉相似變換,(unitary si
24、milarity ),重視,基于酉相似變換的分解具有后向穩(wěn)定性,,---Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide,,,P.173,,標(biāo)準(zhǔn),Arnoldi,分解,廣義,Krylov,分解,,其中,Q,為酉矩陣,且可以連續(xù)疊加,Matrix Algorithms || EigenSystems. G.W.Stewart P.309,實(shí)測(cè)更,效率,,實(shí)測(cè)迭代次數(shù),運(yùn)行時(shí)間都減少約,1/3,。,(與,ARPACK,對(duì)比),Krylov-schur,分解的優(yōu)點(diǎn),Deflation,操作的基
25、本思路,也類似,更加復(fù)雜,。,易于挑選,ritz,值作為,implicit shift,易于,Deflation(Lock+Purge),Schur,分解將任何一個(gè)矩陣歸約為上三角矩陣,對(duì)角線即為該矩陣的特征值;并且在這條對(duì)角線上,特征值可以通過(guò)酉相似變換來(lái)任意排列。也就是說(shuō),在生成,Rayleigh,矩陣,并計(jì)算出所有的,ritz,值之后,可以把需要的,Ritz,值排到前面,而不需要的,Ritz,值排到后面,重啟之后,只有挑出來(lái)的,Ritz,值出現(xiàn)在序列中。,G. W. Stewart, A Krylov–Schur algorithm for large eigenproblems, SI
26、AM J. Matrix Anal. Appl., 23 (2001), pp. 601–614.,Krylov-schur,及其重啟,,考慮重啟后,,B,矩陣更加復(fù)雜,如右圖所示,包含重啟的,Krylov-Schur,分解算法,Matrix Algorithms || EigenSystems. G.W.Stewart P329-330,收斂速度更快,經(jīng)多個(gè)實(shí)際算例驗(yàn)證,其速度明顯快于目前通用的,ARPACK,,一般迭代次數(shù)僅為,ARPACK,的,60%-70%,。,,,Why?,最新算法,,Subspace iteration with approximate spectral proj
27、ection,FEAST AS A SUBSPACE ITERATION EIGENSOLVER ACCELERATED BY APPROXIMATE SPECTRAL PROJECTION,---,P,.,TAK,,,P,.,TANG,,,E,.,POLIZZI,可求出復(fù)平面內(nèi)指定區(qū)域內(nèi)的所有特征值,,主要用于對(duì)稱矩陣,需推廣到非對(duì)稱矩陣,基于,cauch,積分的,spectral projection,shift-invert,變換,標(biāo)準(zhǔn)的,shift-invert,變換,Matrix transformations for computing rightmost eigenvalues
28、of large sparse non-symmetric eigenvalue problems -- K.MEERBERGEN, D.ROOSE,Cayley,變換,Matrix transformations for computing rightmost eigenvalues of large sparse non-symmetric eigenvalue problems -- K.MEERBERGEN, D.ROOSE,,Cayley,變換,Cayley,變換特性,,1,平行于虛軸的直線,,映射到單位圓,,2,該直線左側(cè)的點(diǎn)被映射到單位圓內(nèi)部,,3,該直線右側(cè)的點(diǎn)被映射到單位
29、圓外部,Filter polynomial,--- Matrix Algorithms || EigenSystems. G.W.Stewart P.317,Aleksei Nikolaevich Krylov,(1863–1945),,showed in 1931 how to use sequences of the form {b, Ab, A2b, . . .} to construct the characteristic polynomial of a matrix. Krylov was a Russian applied mathematician whose scienti
30、fic interests arose from his early training in naval science that involved the theories of buoyancy, stability, rolling and pitching, vibrations, and compass theories. Krylov served as the director of the Physics–Mathematics Institute of the Soviet Academy of Sciences from 1927 until 1932, and in 19
31、43 he was awarded a “state prize” for his work on compass theory. Krylov was made a “hero of socialist labor,” and he is one of a few athematicians to have a lunar feature named in his honor—on the moon there is the “Crater Krylov.”,Walter Edwin Arnold,i (1917–1995),,was an American engineer who pub
32、lished this technique in 1951, not far from the time that Lanczos’s algorithm emerged. Arnoldi received his undergraduate degree in mechanical engineering from Stevens Institute of Technology, Hoboken, New Jersey, in 1937 and his MS degree at Harvard University in 1939. He spent his career working a
33、s an engineer in the Hamilton Standard Division of the United Aircraft Corporation where he eventually became the division’s chief researcher. He retired in 1977. While his research concerned mechanical and aerodynamic properties of aircraft and aerospace structures, Arnoldi’s name is kept alive by
34、his orthogonalization procedure.,附一,,附二 參考文獻(xiàn),[1] Numerical Analysis. Rainer Kress. Springer-Verlag. 1991,,[2] I.S.Duff, A.M.Erisman, and J.K.Reid. Direct Methods for Sparse Matrices. London:Oxford Univ. Press,1986.,,[3] J.W.H.Liu. The Multifrontal Method for Sparse Matrix Solution: Theory and Pract
35、ice. SIAM Rev., 34 (1992), pp. 82--109.,,[4] T.A.Davis. A column pre-ordering strategy for the unsymmetric-pattern multifrontal method, ACM Trans. Math. Software, vol 30, no. 2, pp. 165-195, 2004.,,[5] N.J.Higham. Accuracy and Stability of Numerical Algorithms. SIAM,2002,,[6] G..H.Golob, C.F.Van loa
36、n. Matrix Computations. The Johns Hopkins University Press. 1996,,[7] J.W.H.Liu. The Multifrontal Method for Sparse Matrix Solution: Theory and Practice. SIAM Rev., 34 (1992), pp. 82--109.,,[8] Fast PageRank Computation via a Sparse Linear System (Extended Abstract)?Gianna M. Del Corso,Antonio Gullí
37、, Francesco Romani.,,[9] Y.Saad, Iterative Methods for Sparse Linear Systems, PWS, Boston,1996,,[10] Y.S. Chen* ,Simon Li. Application of Multifrontal Method for Doubly-Bordered Sparse Matrix in Laser Diode Simulator. NUSOD,2004,,[11],陳英時(shí) 吳文勇等,.,采用多波前法求解大型結(jié)構(gòu)方程組,.,建筑結(jié)構(gòu),,2007,年,09,期,,[12],宋新立,,,陳英時(shí)等,.,電力系統(tǒng)全過(guò)程動(dòng)態(tài)仿真中大型稀疏線性方程組的分塊求解算法,,,,非常感謝各位老師,同學(xué)!,,,
- 溫馨提示:
1: 本站所有資源如無(wú)特殊說(shuō)明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
2: 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
3.本站RAR壓縮包中若帶圖紙,網(wǎng)頁(yè)內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒(méi)有圖紙預(yù)覽就沒(méi)有圖紙。
4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
5. 裝配圖網(wǎng)僅提供信息存儲(chǔ)空間,僅對(duì)用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對(duì)用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請(qǐng)與我們聯(lián)系,我們立即糾正。
7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。
最新文檔
- 市教育局冬季運(yùn)動(dòng)會(huì)安全工作預(yù)案
- 2024年秋季《思想道德與法治》大作業(yè)及答案3套試卷
- 2024年教師年度考核表個(gè)人工作總結(jié)(可編輯)
- 2024年xx村兩委涉案資金退還保證書(shū)
- 2024年憲法宣傳周活動(dòng)總結(jié)+在機(jī)關(guān)“弘揚(yáng)憲法精神推動(dòng)發(fā)改工作高質(zhì)量發(fā)展”專題宣講報(bào)告會(huì)上的講話
- 2024年XX村合作社年報(bào)總結(jié)
- 2024-2025年秋季第一學(xué)期初中歷史上冊(cè)教研組工作總結(jié)
- 2024年小學(xué)高級(jí)教師年終工作總結(jié)匯報(bào)
- 2024-2025年秋季第一學(xué)期初中物理上冊(cè)教研組工作總結(jié)
- 2024年xx鎮(zhèn)交通年度總結(jié)
- 2024-2025年秋季第一學(xué)期小學(xué)語(yǔ)文教師工作總結(jié)
- 2024年XX村陳規(guī)陋習(xí)整治報(bào)告
- 2025年學(xué)校元旦迎新盛典活動(dòng)策劃方案
- 2024年學(xué)校周邊安全隱患自查報(bào)告
- 2024年XX鎮(zhèn)農(nóng)村規(guī)劃管控述職報(bào)告