GCN-RA: A Graph Convolutional Network-based Resource Allocator for Reconfigurable Systems

Post date: 2023/12/14

Seyed Mehdi Mohtavipour
School of Electrical Engineering
Iran University of Science and Technology
Tehran, Iran

Hadi Shahriar Shahhoseini
School of Electrical Engineering
Iran University of Science and Technology
Tehran, Iran

Abstract:

Hardware architectures with various reconfiguration capabilities provide significant computational speedup through parallelism and concurrency. However, data transmission after resources have been allocated to an application remains a critical challenge: it introduces communication delays and time overheads, particularly in the execution of large-scale applications. This paper proposes a novel two-level (inter- and intra-cluster) resource allocation approach that reduces communication distances by defining regional resources. The requested application is partitioned by a customized Graph Convolutional Network (GCN) into high-quality, independent parts with the lowest overhead, and these parts are mapped to adjacent or non-adjacent regions using an analytical distance metric. With this approach, the general configuration of the final mapping solution can be obtained efficiently by ignoring long hop-count distances, which improves the convergence rate of the optimization algorithms. Extensive experiments on large-scale synthetic and real applications derived from a CAD flow were performed to evaluate the effectiveness of the proposed approach against previous works. The results show that, for a fixed number of optimization iterations, the approach achieves up to 15.97% improvement in mapping solution quality and a resource utilization factor of 96.75%.
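
The link between regional mapping and communication distance can be illustrated with a minimal sketch. It is not the paper's metric or algorithm: it assumes a 2D mesh of processing elements, uses Manhattan distance as the hop count, and scores a placement as traffic volume times hop count summed over the edges of a task graph. The task graph, the placements, and the names `hop_count` and `communication_cost` are hypothetical and exist only for this example.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]


def hop_count(a: Coord, b: Coord) -> int:
    """Manhattan distance between two mesh cells (assumed hop-count metric)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def communication_cost(edges: Dict[Tuple[str, str], int],
                       placement: Dict[str, Coord]) -> int:
    """Sum of traffic volume times hop count over all task-graph edges."""
    return sum(volume * hop_count(placement[u], placement[v])
               for (u, v), volume in edges.items())


# Hypothetical task graph: two tightly coupled pairs joined by one weak link.
edges = {("a", "b"): 10, ("c", "d"): 10, ("b", "c"): 1}

# Each coupled pair stays inside one region of a 4x4 mesh.
clustered = {"a": (0, 0), "b": (0, 1), "c": (3, 2), "d": (3, 3)}

# The coupled pairs are scattered across opposite corners of the mesh.
scattered = {"a": (0, 0), "b": (3, 3), "c": (0, 3), "d": (3, 0)}

print(communication_cost(edges, clustered))  # 24
print(communication_cost(edges, scattered))  # 123
```

In this toy case, keeping heavily communicating tasks in the same region leaves only the weak inter-part link to cross a long distance, which is the intuition behind partitioning the application before mapping it.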

Keywords: Reconfigurable architecture; Graph processing; Distributed computing; Neural networks

Cite this paper as:
Mohtavipour, S.M. and Shahhoseini, H.S., 2023. GCN-RA: A Graph Convolutional Network-based Resource Allocator for Reconfigurable Systems. Journal of Computational Science, p.102178.
