Optimizing instruction cache performance for operating system intensive workloads

Author

Torrellas, Josep ; Xia, Chun ; Daigle, Russell

Author_Institution

Center for Supercomput. Res. & Dev., Illinois Univ., Urbana, IL, USA

fYear

1995

fDate

1995

Firstpage

360

Lastpage

369

Abstract

High instruction cache hit rates are key to high performance. One known technique to improve the hit rate of caches is to use an optimizing compiler to minimize cache interference via an improved layout of the code. This technique, however, has been applied to application code only, even though there is evidence that the operating system often uses the cache heavily and with less uniform patterns than applications. Therefore, it is unknown how well existing optimizations perform for systems code and whether better optimizations can be found. We address this problem in this paper. We characterize in detail the locality patterns of the operating system code and show that there is substantial locality. Unfortunately, caches are not able to extract much of it: rarely executed special-case code disrupts spatial locality, loops with few iterations that call routines make loop locality hard to exploit, and plenty of loop-less code hampers temporal locality. As a result, interference within popular execution paths dominates instruction cache misses. Based on our observations, we propose an algorithm to expose these localities and reduce interference. For a range of cache sizes, associativities, line sizes, and other organizations, we show that we reduce total instruction miss rates by 31-86% (up to 2.9 absolute points). Using a simple model, this corresponds to execution time reductions on the order of 12-26%. In addition, our optimized operating system combines well with optimized or unoptimized applications.

Keywords

cache storage; optimising compilers; program compilers; application code; cache interference; interference; locality patterns; operating system intensive workloads; optimizing compiler; optimizing instruction cache performance; temporal locality; total instruction miss rates; Central Processing Unit; Contracts; Control systems; Delay; Interference; NASA; Operating systems; Optimizing compilers; Research and development; Size control

fLanguage

English

Publisher

IEEE

Conference_Titel

High-Performance Computer Architecture, 1995. Proceedings., First IEEE Symposium on

Conference_Location

Raleigh, NC

Print_ISBN

0-8186-6445-2

Type

conf

DOI

10.1109/HPCA.1995.386527

Filename

386527