To benefit from ever-improving microprocessor performance, applications must be modified to exploit the parallelism of today’s microprocessors. One such application in need of modernization is the Weather Research and Forecasting (WRF) model, which is designed for numerical weather prediction and atmospheric research. The WRF software infrastructure consists of several components, such as dynamic solvers and physics schemes. Numerical models resolve the large-scale flow, whereas subgrid-scale parameterizations estimate small-scale properties (e.g., boundary layer turbulence and convection, clouds, radiation). Because of the complex nonlinear nature of the atmosphere, these small-scale properties have a significant influence on the resolved scales. For the cloudy planetary boundary layer (PBL), it is fundamental to parameterize vertical turbulent fluxes and subgrid-scale condensation in a realistic manner. A parameterization based on the total energy–mass flux (TEMF), which unifies the turbulence and moist convection components, produces a better result than other PBL schemes. We therefore present our optimization results for the TEMF PBL scheme. The optimizations included vectorization of the code to utilize the multiple vector units inside each processor core. They improved the performance of the original TEMF code on the Xeon Phi 7120P by a factor of
. Furthermore, the same optimizations improved the performance of TEMF on a dual-socket configuration of eight-core Intel Xeon E5-2670 CPUs by a factor of
compared to the original TEMF code.