%%% Preamble
\documentclass[paper=a4, fontsize=12pt]{article}
% scrreprt // report // article
\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amsfonts,amsthm} % Math packages
\usepackage{graphicx}
\usepackage{multicol}
\usepackage{scrextend}
\usepackage[colorinlistoftodos]{todonotes}
\usepackage{blindtext}
\newcommand*{\Scale}[2][4]{\scalebox{#1}{$#2$}}%
\newcommand*{\Resize}[2]{\resizebox{#1}{!}{$#2$}}%
\usepackage{enumitem}
\usepackage{listingsutf8}
\usepackage{soul}
\usepackage{float}
\usepackage{epstopdf}
\usepackage{subfig}
\usepackage{amssymb}
\setlist{leftmargin=5.5mm}
\usepackage[margin=1.1in]{geometry}
\usepackage[all]{xy}
\usepackage{pdflscape}
\usepackage{longtable}
\usepackage{cite}
\usepackage[hyphens]{url}
\usepackage[hidelinks]{hyperref}
\hypersetup{breaklinks=true}
\urlstyle{same}
\usepackage{setspace}
\setlength{\intextsep}{3mm}
%add gif
\epstopdfDeclareGraphicsRule{.gif}{png}{.png}{convert gif:#1 png:\OutputFile}
\AppendGraphicsExtensions{.gif}
\begin{document}
\begin{titlepage}
\newcommand{\HRule}{\rule{\linewidth}{0.5mm}} % Defines a new command for the horizontal lines, change thickness here
\center % Center everything on the page
%----------------------------------------------------------------------------------------
% HEADING SECTIONS
%----------------------------------------------------------------------------------------
\textsc{\LARGE Tecnológico de Monterrey}\\[1.3cm] % Name of your university/college
\textsc{\Large Operating Systems Lecture }\\[0.6cm] % Major heading such as course name
\textsc{\large (TC2008)}\\[0.6cm] % Minor heading such as course title
\HRule \\[0.4cm]
{ \huge \bfseries Modifying Linux Kernel}\\[0.4cm] % Title of your document
\HRule \\[1.5cm]
\begin{minipage}{0.4\textwidth}
\begin{flushleft} \large
\emph{Authors:}\\[.5cm]
Adair Ibarra Bautista
\\[.5cm]
Oscar Arturo Zapata Buenrostro\\[.5cm]
\end{flushleft}
\end{minipage}
~
\begin{minipage}{0.4\textwidth}
\begin{flushright} \large
\emph{Professor:} \\
Victor Rodríguez Bahena \end{flushright}
\end{minipage}\\[3cm]
\end{titlepage}
\newpage
\textsc{\textbf{Abstract}}
In this document we focus on tuning the Linux kernel through memory and scheduler parameters. The main objective is to study the performance of a computer while it runs the AIO-Stress benchmark. The benchmark had to be run several times, since each of the three parameters studied in this project was given five different values. After the tests were completed, the results were displayed as graphs, which show that all three variables have a noticeable influence on the performance of the computer.
\newpage
\tableofcontents
\newpage
\begin{multicols}{2}
\section{Introduction}
The AIO-Stress Test is a simple workload generator for systems. It imposes a configurable amount of CPU, memory, I/O, and disk stress on the computer. This test allowed us to determine which parameter has the greatest influence on the performance of the computer and helped us to characterize how the computer behaves under specific circumstances.
The purpose of this paper is to present a case study that explores the benefits and disadvantages of modifying kernel memory and scheduler parameters (commonly known as ``scheduler tuning''). To accomplish this, we use the AIO-Stress tool for our benchmarks and experiments.
\section{\large Theoretical Framework}
Scheduling policies are divided into two major categories:
\begin{itemize}
\item Real time policies
\begin{enumerate}
\item SCHED-FIFO
\item SCHED-RR
\end{enumerate}
\item Normal policies
\end{itemize}
\subsection{Real time policies}
Real-time threads are scheduled first, and normal threads are scheduled after all real-time threads have been scheduled. The real-time policies are used for time-critical tasks that must complete without interruptions.\cite{1}
\subsubsection{SCHED-FIFO policy}
This policy is also referred to as static priority scheduling, because it defines a fixed priority (between 1 and 99) for each thread. The scheduler scans a list of SCHED-FIFO
threads in priority order and schedules the highest priority thread that is ready to run. This thread runs until it blocks, exits, or is preempted by a higher priority thread that is ready to run.
In the Linux kernel, the SCHED-FIFO policy includes a bandwidth cap mechanism. This protects real-time application programmers from real-time tasks that might monopolize the CPU.\cite{1}
\subsubsection{SCHED-RR policy}
SCHED-RR is a round-robin variant of the SCHED-FIFO policy. SCHED-RR threads are also given a fixed priority between 1 and 99. However, threads with the same priority are scheduled round-robin style within a certain quantum, or time slice.
\subsection{Scheduler tuning}
The scheduler offers a number of parameters that allow its behavior to be tuned to actual needs:
\begin{itemize}
\item \textit{sched\_rt\_runtime\_us}: The maximum CPU time that all real-time tasks together may use within each \textit{sched\_rt\_period\_us} (0.95 s by default). When this amount of time is used up, these tasks must wait until the next period begins before they are allowed to run again.\cite{2}
\item \textit{sched\_rt\_period\_us}: The length of the period over which the real-time CPU time budget is measured (1 s by default). \cite{2}
\item \textit{sched\_min\_granularity\_ns}: The minimum time a task is allowed to run on the CPU before it can be preempted.
By default, it is set to 4ms, so any task runs for at least 4ms before being preempted. \cite{3}
\item \textit{sched\_latency\_ns}: This parameter, together with \textit{sched\_min\_granularity\_ns}, decides the scheduler period, that is, the period in which every task on the run queue is scheduled at least once. \cite{3}
\item \textit{sched\_migration\_cost\_ns}: Determines how long a migrated process has to run before the kernel will consider migrating it to another core again.
\item \textit{sched\_nr\_migrate}: Specifies the number of tasks that can be migrated at a time.
\item \textit{sched\_rr\_timeslice\_ms}: Threads that have the same priority are scheduled round-robin style within a certain time slice. This parameter sets the length of that time slice in milliseconds.
\item \textit{swappiness}: Controls the relative weight given to swapping out runtime memory, as opposed to dropping pages from the system page cache. Swappiness can be set to values between 0 and 100 inclusive. A low value causes the kernel to avoid swapping, while a higher value causes the kernel to use swap space more readily.
\end{itemize}
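To illustrate how these parameters interact, two relations can be sketched. The symbols below are our own shorthand rather than kernel names, and the relations reflect our understanding of the default Linux behavior rather than anything measured in this project. Writing $R$ for \textit{sched\_rt\_runtime\_us} and $T$ for \textit{sched\_rt\_period\_us}, and assuming the usual defaults of 950,000$\mu$s and 1,000,000$\mu$s, the largest share of CPU time available to real-time tasks is
\[
\frac{R}{T} = \frac{950{,}000}{1{,}000{,}000} = 0.95,
\]
which leaves 5\% of each period for normal tasks. Similarly, writing $L$ for \textit{sched\_latency\_ns}, $G$ for \textit{sched\_min\_granularity\_ns} and $n$ for the number of runnable tasks, the scheduler period $P$ is
\[
P =
\begin{cases}
L, & n \le L/G,\\
n\,G, & n > L/G.
\end{cases}
\]
With $L = 24$ms and $G = 4$ms, for example, up to $24/4 = 6$ tasks share a 24ms period, while $n = 10$ tasks stretch it to $10 \times 4 = 40$ms.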
\section{Objective}
The objective of this project is to ``get our hands dirty'' by modifying real variables of a real operating system and to see how these changes actually affect the performance of a given process or service of the operating system.
In addition, it is important to know how benchmarks are run and how to read their results, including how to decide which values can improve the operating system.
\section{Methodology}
In this project we modified three kernel variables to observe their effect on performance. The variables we changed are vm.swappiness, kernel.sched\_latency\_ns and kernel.sched\_rt\_runtime\_us.
The benchmark we ran to test these values is the Phoronix AIO-Stress test.
For each variable, we ran the test with five different values while leaving all other variables (including the other two under study) at their default values. Once the five runs were finished, we analyzed the graphs produced by Phoronix.
The values given to each variable are detailed in the following sections.
All these tests were performed on a laptop computer. Its technical specifications are shown in Figure \ref{fig:PC}.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{CaractPC.png}
\caption{\label{fig:PC} Technical specifications of the computer}
\end{figure}
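The procedure for one variable could be reproduced roughly as follows. This is a sketch rather than the exact commands we ran: it assumes a kernel that exposes these parameters through \texttt{sysctl} under the underscore names shown, and that the Phoronix Test Suite is installed with its \texttt{aio-stress} test profile available. The values are examples taken from the tables in the following subsections.

{\footnotesize
\begin{verbatim}
# Set one variable (others stay default)
sudo sysctl -w vm.swappiness=40

# Run the benchmark with that setting
phoronix-test-suite benchmark aio-stress

# Restore the default afterwards
sudo sysctl -w vm.swappiness=60

# Scheduler variables: same pattern
sudo sysctl -w \
  kernel.sched_latency_ns=12000000
sudo sysctl -w \
  kernel.sched_rt_runtime_us=500000
\end{verbatim}
}

Values set with \texttt{sysctl -w} do not survive a reboot, which makes it easy to return to the defaults between test runs.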
\subsection{Swappiness}
This was the first variable we modified. We changed its value and then ran the test, repeating this process five times.
The default value of this variable is 60, and its possible values range from 0 to 100.
Keeping the scheduler variables at their default values, we gave swappiness the values shown in Table \ref{table:table1}, where the asterisk marks the default.
\begin{table}[H]
\centering
\begin{tabular}{|l|}
\hline
\textbf{Value} \\ \hline
20\% \\ \hline
40\% \\ \hline
60\%* \\ \hline
80\% \\ \hline
100\% \\ \hline
\end{tabular}
\caption{Values given to the swappiness variable}
\label{table:table1}
\end{table}
\subsection{Latency}
After returning the swappiness variable to its default value, we changed the value of the scheduler latency.
The default value for this variable is 24ms.
The values given to this variable for the AIO-Stress benchmark can be seen in Table \ref{table:table2}, where the asterisk marks the default.
\begin{table}[H]
\centering
\begin{tabular}{|l|}
\hline
\textbf{Value} \\ \hline
6ms \\ \hline
12ms \\ \hline
24ms* \\ \hline
48ms \\ \hline
72ms \\ \hline
\end{tabular}
\caption{Values given to the latency variable}
\label{table:table2}
\end{table}
\subsection{Runtime}
The last variable we changed is the runtime variable. Its values can range from 0 to 1,000,000$\mu$s. The default value is 950,000$\mu$s.
The values that were used for the test are shown in Table \ref{table:table3}, where the asterisk marks the default.
\begin{table}[H]
\centering
\begin{tabular}{|l|}
\hline
\textbf{Value} \\ \hline
0.01s \\ \hline
0.25s \\ \hline
0.50s \\ \hline
0.75s \\ \hline
0.95s* \\ \hline
\end{tabular}
\caption{Values given to the runtime variable}
\label{table:table3}
\end{table}
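Since the runtime variable is expressed in microseconds, each value in seconds has to be multiplied by $10^{6}$ before it is written to the kernel. As a small worked example,
\[
0.50\,\mathrm{s} \times 10^{6}\,\tfrac{\mu\mathrm{s}}{\mathrm{s}} = 500{,}000\,\mu\mathrm{s}.
\]
Assuming the real-time period is left at what we understand to be its usual default of 1s, this value lets real-time tasks use at most
\[
\frac{500{,}000\,\mu\mathrm{s}}{1{,}000{,}000\,\mu\mathrm{s}} = 50\%
\]
of the CPU time in each period, compared with 95\% for the default runtime.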
\section{Results}
Once all the benchmarks were done, Phoronix graphed the results and provided an overview of each test.
Analyzing these graphs, we could not find a clear pattern relating the value of each variable to the result, so no general rule can be drawn from these tests.
In the following sections the results are explained along with their graphs and the result overview provided by Phoronix. Since we are using the AIO-Stress benchmark, all the results are given in MB/s.
For all the results in this test, higher is better.
\subsection{Swappiness}
Figure \ref{fig:Gswap} shows a bar chart of the performance of the computer in this benchmark for all the different runs. This chart makes it easy to see the difference between the runs.
With the default value of swappiness (60), we get a data transmission rate of 381.88MB/s.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{Gswap.png}
\caption{\label{fig:Gswap} Bar chart of performance for swappiness variable.}
\end{figure}
In Figure \ref{fig:ROswap} we can see that the best value for swappiness is 40, giving a transmission rate of 386.20MB/s, while the worst value is 20, with a transmission rate of 369.41MB/s, 12.47MB/s less than with the default value.
Recalling that swappiness controls how readily the kernel swaps memory pages from RAM out to the swap space on disk, this test suggests that a good swappiness value for this benchmark is 40. This means trying to keep most of the information in RAM, in order to avoid disk accesses, but without avoiding swapping altogether.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{ROswap.png}
\caption{\label{fig:ROswap} Result overview of different runs.}
\end{figure}
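To put these differences in perspective, they can be expressed relative to the default result of 381.88MB/s. The best value (40) gives
\[
\frac{386.20 - 381.88}{381.88} = \frac{4.32}{381.88} \approx 1.13\%
\]
more throughput than the default, while the worst value (20) gives
\[
\frac{381.88 - 369.41}{381.88} = \frac{12.47}{381.88} \approx 3.27\%
\]
less. These percentages are simple arithmetic on the single figures reported above, and they say nothing about the run-to-run variation of the benchmark.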
\subsection{Latency}
Regarding latency, we initially thought that the results of this test would be quite obvious. Of course with less latency everything would work better, right?
Wrong.
We started the benchmark with the default latency value of 24ms. We then increased this value to confirm our theory. The surprise came when we decreased it: as the bar chart in Figure \ref{fig:Glat} shows, decreasing the latency did not improve the performance.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{Glatency.png}
\caption{\label{fig:Glat} Bar chart of performance for latency variable.}
\end{figure}
In Figure \ref{fig:ROlat} we can see that the best result (by a wide margin) is obtained with the default value of 24ms. The second best is obtained with a value of 48ms, 31.52MB/s behind.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{ROlatency.png}
\caption{\label{fig:ROlat} Result overview of different runs.}
\end{figure}
From the results of this test, we conclude that the system actually needs some scheduling latency to perform well. The value does not have to be the smallest possible, but the one that allows the system to work as it should. Reducing the latency may actually make some processes slower, possibly because shorter scheduling periods cause more frequent context switches, and this in turn affects the data transmission rate.
\subsection{Runtime}
Finally, we tested the kernel.sched\_rt\_runtime\_us variable. This test produced the smallest differences between the values given to the variable; still, we found a value that improves the performance.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{Gruntime.png}
\caption{\label{fig:Grun} Comparison graph of performance obtained with each test run.}
\end{figure}
The improvement with this variable was only 3.28MB/s. To get this improvement, we changed the runtime from its default value (0.95s) to 0.50s. We also found another value (0.25s) that performs better than the default, with an improvement of 0.58MB/s.
The worst value in this test was 0.10s.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{ROruntime.png}
\caption{\label{fig:ROruntime} Result overview of different runs.}
\end{figure}
From this last test we conclude that we can reduce by almost half a second the time that real-time processes are allowed to run on the CPU in each period. For the AIO-Stress test, the default gives these processes more time than they need, so reducing this variable improves the data transmission rate.
\section{Conclusion}
It is very important to keep in mind that these results apply only to the AIO-Stress benchmark. They do not mean that setting the variables to the values we found will improve the operating system in general, since they depend only on I/O operations.
This kind of analysis is very useful in specific situations where you want a particular process or service of the operating system to be faster, whether on a server, a microcontroller or a personal computer. In these situations, you have to look for the benchmark related to the service you want to improve and perform several test runs with many different values to find the optimal combination of values for the variables.
Although some improvements may not be very significant, for a purpose-specific application a difference of 3.28MB/s can be an important one.
For future work, it would be interesting to keep looking for the best value of each variable, considering that in these tests we left a big gap between consecutive values. However, we now have a good starting point for further improving the data transmission rate in this benchmark.
It would also be interesting to analyze whether the individual values we found still improve the performance when they are all applied together, or whether their combination is worse than the default values.
\end{multicols}
\newpage
\begin{thebibliography}{X}
\bibitem{1} Red Hat. (2016). CPU Scheduling. \url{https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-cpu-scheduler.html}
\bibitem{2} Kobus, J., \& Szklarski, R. (2016). Completely Fair Scheduler and its tuning (1st ed.). \url{https://www.fizyka.umk.pl/~jkob/prace-mag/cfs-tuning.pdf}
\bibitem{3} Oakbytes. (2012). Linux Scheduler. \url{https://oakbytes.wordpress.com/2012/06/06/linux-scheduler-cfs-and-latency/}
\end{thebibliography}
\end{document}