
Thumb-2 is the most recent instruction set architecture for ARM processors, which are among the most widely used embedded processors. This paper proposes two extensions to improve the performance of the Thumb-2 instruction set architecture: addressing mode extensions, and sign/zero extension combined with data processing instructions. To speed up access to an element of aggregate data, the proposed approach first introduces three new addressing modes for load and store instructions: register-plus-immediate offset, negative register offset, and post-increment register offset. The register-plus-immediate offset mode permits two offsets, and the negative register offset mode allows the offset to be the negated content of a register. The post-increment register offset mode automatically modifies the offset address after the memory operation. The second extension combines sign/zero extension with a data processing instruction, so that the result of a data processing operation can be sign- or zero-extended, which accelerates type conversion. Several of the least frequently used instructions are reduced to provide the encoding space for the new extensions. Experiments show that the proposed approach improves performance by an average of 8.6% compared to the Thumb-2 instruction set architecture.
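
The proposed extensions can be illustrated with a small behavioural model in C. This is only a sketch under stated assumptions, not the encoding or semantics defined in the paper: the function names are hypothetical, registers are modelled as 32-bit unsigned values, the extension width is fixed at 8 bits for brevity, and the post-increment mode is assumed to advance the offset register by a caller-supplied step after the address is formed.

```c
#include <stdint.h>

/* Register-plus-immediate offset (assumed form): the effective address
 * combines the base with two offsets, a register and an immediate. */
uint32_t ea_reg_plus_imm(uint32_t base, uint32_t index, uint32_t imm) {
    return base + index + imm;
}

/* Negative register offset (assumed form): the register content is
 * subtracted from the base instead of added. */
uint32_t ea_neg_reg(uint32_t base, uint32_t index) {
    return base - index;
}

/* Post-increment register offset (assumed form): the access uses
 * base + offset, and the offset register is updated afterwards.
 * The step size is an assumption made for this illustration. */
uint32_t ea_post_inc_reg(uint32_t base, uint32_t *offset, uint32_t step) {
    uint32_t ea = base + *offset;
    *offset += step;
    return ea;
}

/* Sign-extend the low 8 bits of x to 32 bits using only unsigned arithmetic. */
uint32_t sext8(uint32_t x) {
    return ((x & 0xFFu) ^ 0x80u) - 0x80u;
}

/* Data processing combined with sign extension: the sum is narrowed to
 * 8 bits and sign-extended in one step, modelling a fused add-and-extend. */
uint32_t add_sext8(uint32_t a, uint32_t b) {
    return sext8(a + b);
}

/* Data processing combined with zero extension: the sum is narrowed to
 * 8 bits and the upper bits are cleared. */
uint32_t add_zext8(uint32_t a, uint32_t b) {
    return (a + b) & 0xFFu;
}
```

In this model, each function stands for work that would otherwise need a separate instruction: an extra add or negate to form the address, an extra add to advance the offset, or an extra extend instruction after the arithmetic operation.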
