|
|
|
PUBLICATIONS
|
|
|
Taiwan Computational Quantum Matter Software Foundry
- Jifeng Yu and Ying-Jer Kao, Spin-1/2 J1-J2 Heisenberg
antiferromagnet on a square lattice: A plaquette renormalized tensor
network study, Phys. Rev. B 85 094407 (2012).
- J. F. Yu, S. C. Hsiao, Y.-J. Kao, GPU accelerated tensor
contractions in the plaquette renormalization scheme, Comput. Fluids
45, 55
- Ti-Yen Lan, Yun-Da Hsieh, Ying-Jer Kao, High-precision
Monte Carlo study of the three-dimensional XY model on GPU,
arXiv:1211.0780.
|
|
|
Development of Quantitative System for Risk Analysis in Finance
- C.H. Han. Instantaneous Volatility Estimation by Fourier
Transform Methods. To appear in Handbook of Financial Econometrics and
Statistics (C.F. Lee, ed.), Springer-Verlag, New York, 2013.
- C.H. Han. Importance Sampling Estimation of Joint Default
Probability under Structural-Form Models with Stochastic Correlation.
Monte Carlo and Quasi-Monte Carlo Methods. Editors Leszek Plaskota and
Henryk Woźniakowski. Springer, 2012.
|
|
|
Solving Large-scale Numerical Problems on GPU
- Chenhan D. Yu, Weichung Wang*, and Dan'l Pierce. (2011) A
CPU-GPU Hybrid Approach for the Unsymmetric Multifrontal Method.
Parallel Computing, 37:759-770.
- Chenhan D. Yu and Weichung Wang. (Preprint) “Performance
Models and Workload Distribution Algorithms for Optimizing a Hybrid
CPU-GPU Multifrontal Solver.”
- Yaohung M. Tsai, Ray-Bing Chen, and Weichung Wang (2012).
“Tuning Block Size for QR Factorization on CPU-GPU Hybrid Systems.”
Special Session: Auto-Tuning for Multicore and GPU (ATMG) in
Conjunction with the IEEE 6th International Symposium on Embedded
Multicore SoCs, Aizu-Wakamatsu, Japan.
- Yukai Hung and Weichung Wang* (2012). Accelerating
Parallel Particle Swarm Optimization via GPU. Optimization Methods and
Software. 27(1):33-51.
- Ray-Bing Chen, Dai-Ni Hsieh, Ying Hung, and Weichung Wang*
(2013, Accepted). Optimizing Latin Hypercube Designs by Particle Swarm.
Statistics and Computing.
- Ray-Bing Chen, Yen-Wen Hsu, Ying Hung, and Weichung Wang.
(Preprint) “Central Composite Discrepancy-Based Uniform Designs for
Irregular Experimental Regions.”
- Cheng-Ying Chou, Yi-Yan Chuo, Yukai Hung, and Weichung
Wang*. (2011) A Fast Forward Projection Using Multithreads for
Multirays on GPUs in Medical Image Reconstruction. Medical Physics,
38(7):4052-4065.
- Cheng-Ying Chou, Yun Dong, Yukai Hung, Yu-Jiun Kao,
Weichung Wang*, Chien-Min Kao, and Chin-Tu Chen. (2012). Accelerating
Image Reconstruction in Dual-Head PET System by GPU and Symmetry
Properties. PLOS ONE 7(12): e50540.
- Quey-Liang Kao and Che-Rung Lee (2012, Dec). Design Fast
Matrix Algorithms on High-Performance Cloud Platforms. IEEE CloudCom
2012.
|
|
|
A Mixed OpenMP/MPI Programming Framework for Hybrid CPU/GPU Cluster Computing
- Tyng-Yeu Liang, Fu-Chun Lu, Jun-Yao Chiu, “A Hybrid
Resource Reservation Method for Workflows in Clouds”, International
Journal of Grid and High Performance Computing (IJGHPC), volume 4,
issue 4, pp.1-21, December, 2012.
- Tyng-Yeu Liang, Yu-Wei Chang, Hung-Fu Li, “A CUDA
Programming Toolkit on Grids”, International Journal of Grid and
Utility Computing, vol. 3, no. 2, pp. 97-111, June, 2012.
- Tyng-Yeu Liang, Hung-Fu Li and Jun-Yao Chiu, “Enabling
Mixed OpenMP/MPI Programming on Hybrid CPU/GPU Computing Architecture”,
2012 Multicore and GPU Programming Models, Languages and Compilers
Workshop, collocated with 26th IEEE International Parallel &
Distributed Processing Symposium, pp.2369-2377, Shanghai, China, May
21-25, 2012.
- Che-Lun Hung, Chun-Yuan Lin, Hsiao-hsi Wang, Chin-Yuan
Chang, Efficient Packet Pattern Matching for Gigabit Network Intrusion
Detection using GPUs, accepted by The 2nd International Workshop on
Embedded Multi-Core computing and Applications (in conjunction with IEEE
ICESS 2012), 2012. (EI)
|
|
|
Accelerating Pattern Matching Using a Novel Parallel Algorithm on GPUs
- Cheng-Hung Lin, Chen-Hsiung Liu, Lung-Sheng Chien, and
Shih-Chieh Chang, "Accelerating Pattern Matching Using a Novel Parallel
Algorithm on GPUs," accepted for publication in IEEE Transactions on
Computers. (SCI)
- Cheng-Hung Lin, Chen-Hsiung Liu, Shih-Chieh Chang, and
Wing-Kai Hon, "Memory-Efficient Pattern Matching Architectures Using
Perfect Hashing on Graphic Processing Units," in Proc. of the 31st
Annual IEEE International Conference on Computer Communications
(INFOCOM 2012), Orlando, Florida, USA, March 25-30, 2012. (Top
conference, Acceptance rate: 18%, 278/1547)
- Cheng-Hung Lin, Chen-Hsiung Liu, Lung-Sheng Chien,
Shih-Chieh Chang, and Wing-Kai Hon, "PFAC Library: GPU-based string
matching algorithm", accepted by GPU Technology Conference (GTC 2012),
San Jose, California, May 14-17, 2012.
- Cheng-Hung Lin and Jyh-Charn Liu, "M-DFA (multithreaded
DFA): An Algorithm for Reduction of State Transitions and Acceleration
of REGEXP Matching," in Proc. of ACM/IEEE Symposium on Architectures for
Networking and Communications Systems (ANCS 2012), Austin, Texas, USA,
Oct. 29-30, 2012.
- Cheng-Hung Lin, Chen-Hsiung Liu, and Shih-Chieh Chang,
"Accelerating Regular Expression Matching Using Hierarchical Parallel
Machines on GPU", in Proc. of IEEE GLOBAL COMMUNICATIONS CONFERENCE
(GLOBECOM 2011), Houston, Texas, USA, December 5-9, pp.1706-1710, 2011.
- Cheng-Hung Lin, Sheng-Yu Tsai, Chen-Hsiung Liu, Shih-Chieh
Chang, and Jyuo-Min Shyu, "Accelerating String Matching Using
Multi-threaded Algorithm on GPU," in Proc. IEEE GLOBAL COMMUNICATIONS
CONFERENCE (GLOBECOM 2010), Miami, Florida, USA, December 6-10, 2010.
(Google citations: 13)
|
|
|
Bioinformatics
- Chun-Yuan Lin (*corresponding author) and Yu-Shiang Lin,
Efficient Parallel Algorithm for Multiple Sequence Alignments with
Regular Expression Constraints on Graphics Processing Units, to appear in
International Journal of Computational Science and Engineering, 2012.
(EI)
- Chun-Yuan Lin, Sheng-Ta Li, and Che Lun Hung,
Frequency-based RE-Sequencing tool for short reads on Graphics
Processing Units, to appear in International Journal of Computational
Science and Engineering, 2012. (EI)
- Sheng-Ta Lee, Chun-Yuan Lin (*corresponding author), Che
Lun Hung, Hsuan Ying Huang, Using Frequency Distance Filteration for
Reducing Database Search Workload on GPU-Based Cloud Service, accepted
by The 2012 International Workshop on Cloud Computing for
Bioinformatics and Its Applications (in conjunction with IEEE CloudCom
2012). (EI)
- Yu-Shiang Lin, Chun-Yuan Lin (*corresponding author), and
Yeh-Ching Chung, GPU-Based Cloud Service for Multiple Sequence
Alignments with Regular Expression Constrains, accepted by The 2012
International Workshop on Cloud Computing for Bioinformatics and Its
Applications (in conjunction with IEEE CloudCom
- Yu-Rong Chen, Che Lun Hung, Yu-Shiang Lin, Chun-Yuan Lin
(*corresponding author), Tien-Lin Lee, Kual-Zheng Lee, Parallel UPGMA
Algorithm on Graphics Processing Units Using CUDA, accepted by The Third
International Workshop on Frontier of GPU Computing (in conjunction with
IEEE HPCC 2012), 2012. (EI)
- Yu-Shiang Lin, Chun-Yuan Lin (*corresponding author), and
Der-Chyuan Lou, Efficient Parallel RSA Decryption Algorithm for
Many-core GPUs with CUDA, accepted by 2012 International Conference on
Telecommunication Systems Management
- Kuan-Ju Lin, Yi-Hsuan Huang, and Chun-Yuan Lin
(*corresponding author), Efficient Parallel Knuth-Morris-Pratt
Algorithm for Multi-GPUs with CUDA, accepted by Workshop on Parallel,
Peer-to-peer, Distributed, and Cloud Computing, International Computer
Symposium 2012.
- Chun Yuan Lin (*corresponding author), Jen-Cheng Huang,
and Sheng-Ta Li, Accelerating Smith-Waterman Algorithm Using Frequency
Distance Filtration on Graphics Processing Units, accepted by The 17th
Mobile Computing Workshop, 2012.
- Chun Yuan Lin, Sheng-Ta Li, Che-Lun Hung, Chuan Yi Tang,
and Yaw-Ling Lin, CUDA-FRESCO: Frequency-based RE-Sequencing tool based
on CO-clustering segmentation by GPU, 2011 IEEE 13th International
Conference on High Performance Computing and Communications, 2011, pp.
857-862. (EI)
- Chun-Yuan Lin, Yu-Shiang Lin, Jiayi Zhou, Chuan Yi Tang,
GPU-REMuSiC: Efficient Constrained Multiple Sequence Alignment
Algorithm on Graphics Processing Units, Proceedings of the 16th
Workshop on Compiler Techniques for High-Performance and Embedded
Computing, 2011.
- Sheng-Ta Li, Chun-Yuan Lin (*corresponding author),
Yu-Shiang Lin, Joy Lee and Chuan Yi Tang, CUDA-FRESCO: an efficient
algorithm for mapping short reads on Graphics Processing Units with
CUDA, Proceedings of the GPU Technology Conference 2010, p. 75.
|
|
|
Music Information Retrieval
- Chung-Che Wang, Chieh-Hsing Chen, Chin-Yang Kuo, Li-Ting
Chiu, and Jyh-Shing Roger Jang, "Accelerating Query by Singing/Humming
on GPU: Optimization for Web Deployment", The 36th International
Conference on Acoustics, Speech, and Signal Processing (ICASSP), Kyoto,
Japan, March 2012.
|
|
|
Computer Graphics
- Min Shih, Yung-Feng Chiu, Ying-Chieh Chen, and Chun-Fa
Chang. Real-Time Ray Tracing with CUDA. International Conference on
Algorithms and Architectures for Parallel Processing (ICA3PP) 2009.
(EI)
|
|
|
HPC and GPU Performance Optimization
- Lung-Sheng Chien, “Hand-Tuned SGEMM on GT200 GPU”,
http://forums.nvidia.com/index.php?showtopic=159033
- Che-Rung Lee, Shih-Hsiang Lo, Nan-Hsi Chen, Yeh-Ching
Chung, I-Hsin Chung (2012, May). GPU Performance Enhancement via
Communication Cost Reduction: Case Studies of Radix Sort and WSN Relay
Node Placement Problem. IEEE/ACM CCGRID 2012, Ottawa, Canada.
- Che-Rung Lee, Zhi-Hung Chen, Quey-Liang Kao (2012, May).
Parallelizing the Hamiltonian Computation in DQMC Simulations:
Checkerboard Method for Sparse Matrix Exponentials on Multicore and
GPU. IEEE IPDPSW 2012, Shanghai, China.
- Shih-Hsiang Lo, Che-Rung Lee, Yeh-Ching Chung, Optimizing
Pairwise Box Intersection Checking on GPUs for Large-Scale Simulations,
accepted by ACM Transactions on Modeling and Computer Simulation
(TOMACS)
- Shih-Hsiang Lo; Che-Rung Lee; Quey-Liang Kao; I-Hsin
Chung; Yeh-Ching Chung, Improving GPU Memory Performance with
Artificial Barrier Synchronization, submitted to IEEE TPDS, under
revision.
- Che-Rung Lee, Shih-Hsiang Lo, Nan-Hsi Chen, Quey-Liang
Kao, Yeh-Ching Chung, I-Hsin Chung, (Preprint) Data Streaming and Data
Compression for GPU Performance Enhancement: Communication Cost
Reduction and Beyond, submitted to International Journal of Parallel
Programming.
- Chun-Yuan Lin (*corresponding author), Wei Sheng Lee, and
Chuan Yi Tang, Parallel Shellsort Algorithm for Many-core GPUs with
CUDA, to appear in International Journal of Grid and High Performance
Computing, 2012. (EI)
- Chi-Cheng Chuang, Yu-Sheng Chiu, Quey-Liang Kao,
Zhi-Hung Chen and Che-Rung Lee (2012, Dec). Accelerating Block
Checkerboard Method on GPU for Performance Enhancement of 2D and 3D
Quantum Monte Carlo Simulations. IEEE CloudCom 2012, Taipei, Taiwan.
|