Somewhat presentable I guess?

Zhengyi Chen 2024-01-31 13:12:41 +00:00
parent a7eb8e8214
commit b8904cd213
2 changed files with 126 additions and 10 deletions

Binary file not shown.

@ -108,7 +108,13 @@
synchronization with CPU cache.
}
\item {
We cannot assume that the MMU magically maintains coherence.
\begin{itemize}
\item {
This seems to be the case for x86\_64 (cache-coherent DMA),
but not for ARM64.
}
\end{itemize}
}
\item {
At transportation time:
@ -379,7 +385,7 @@
\section{3. Progress}
\begin{frame}
\frametitle{3. Progress}
\begin{itemize}
\item {
Goal: in-kernel implementation of software cache-coherency via
@ -408,10 +414,35 @@
ARMv8 defines two levels of cache coherence:
\begin{itemize}
\item {
\textit{Point-of-Unification} (PoU): Within a core, the instruction
cache, data cache, and TLB all agree on the copy seen for a
particular address.
\begin{itemize}
\item Notably, changing a PTE requires maintenance to the PoU.
\end{itemize}
}
\item {
\textit{Point-of-Coherence} (PoC): All DMA-capable agents
(CPUs or otherwise) agree on the copy seen for a particular
address.
}
\end{itemize}
For this thesis's purposes, we strive for PoC.
}
\item {
Operations to achieve the latter are encapsulated in the Linux
kernel as \texttt{(d|i)cache\_(clean|inval)\_poc}.
\begin{itemize}
\item Declared in \texttt{arch/arm64/include/asm/cacheflush.h}.
\item Defined in \texttt{arch/arm64/mm/cache.S}.
\item {
Each takes a virtual address range w.r.t.\ the \textit{current}
address space and writes back/invalidates the matching cache
entries.
}
\item {
Open question: Can these only be called in process context (for
userspace virtual addresses), or in any context (for kernel
virtual addresses)?
}
\end{itemize}
}
@ -420,21 +451,106 @@
\begin{frame}
\frametitle{Kernel Patch for On-demand Coherency}
\begin{itemize}
\item {
Problem: These symbols are not exported, as they are not intended
for driver use.
}
\item {
Temporary solution: re-export them by patching the kernel.
\begin{itemize}
\item Note: the patch targets kernel version v6.7.0.
\item {
Longer-term solution: structure the kernel module code so
that it uses an existing driver-facing API
(e.g., the DMA API; see \textit{smbdirect} for an example).
}
\end{itemize}
}
\item {
The patch implements a wrapper function
\texttt{\_\_dcache\_clean\_poc} to re-export
\texttt{dcache\_clean\_poc} into the driver namespace.
}
\item {
The exported symbol is declared in a separate header file.
}
\end{itemize}
\end{frame}
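The wrapper described in the frame above can be sketched roughly as follows. This is a minimal illustration and not the actual patch: it assumes the \texttt{(start, end)} signature of \texttt{dcache\_clean\_poc}, and everything inside the \texttt{\#ifndef \_\_KERNEL\_\_} branch (the no-op \texttt{EXPORT\_SYMBOL}, the 64-byte line size, the \texttt{lines\_cleaned} counter, and the stand-in \texttt{dcache\_clean\_poc} itself) is a hypothetical userspace substitute so that the pattern compiles outside the kernel tree.

```c
/*
 * Sketch only. The non-kernel branch provides userspace stand-ins so the
 * wrapper pattern can be compiled and exercised outside the kernel tree.
 */
#include <assert.h>

#ifndef __KERNEL__
/* Stand-in for the kernel's EXPORT_SYMBOL(); expands to a harmless no-op. */
#define EXPORT_SYMBOL(sym) _Static_assert(1, #sym)

/* Assumption: 64-byte cache lines. Real code derives this from CTR_EL0. */
#define SKETCH_LINE_SIZE 64UL

/* Hypothetical counter of simulated per-line clean operations. */
static unsigned long lines_cleaned;

/*
 * Stand-in loosely modeled on a by-line walk over [start, end):
 * align start down to a line boundary, then visit each line once.
 */
static void dcache_clean_poc(unsigned long start, unsigned long end)
{
	unsigned long addr = start & ~(SKETCH_LINE_SIZE - 1);

	for (; addr < end; addr += SKETCH_LINE_SIZE)
		lines_cleaned++; /* real code would issue "dc cvac" here */
}
#endif

/* Prototype that would live in the separate header file. */
void __dcache_clean_poc(unsigned long start, unsigned long end);

/* Wrapper that makes the clean-to-PoC operation reachable from modules. */
void __dcache_clean_poc(unsigned long start, unsigned long end)
{
	dcache_clean_poc(start, end);
}
EXPORT_SYMBOL(__dcache_clean_poc);
```

A module built against the patched kernel would then include the new header and call the wrapper with a kernel virtual address range, for example one page starting at the kernel mapping of a shared page.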
\begin{frame}
\frametitle{Proof-of-Concept Kernel Module}
\begin{itemize}
\item {
Dynamically allocates \texttt{GFP\_USER} pages and remaps them to
userspace on \texttt{mmap}.
\begin{itemize}
\item {
\texttt{GFP\_USER} is used so that (for convenience) pages are
directly addressable in kernelspace (via the kernel page table).
}
\item {
Pages are lazily allocated and shared between multiple
processes (i.e., user address spaces).
}
\item {
Exposed as the character device \texttt{/dev/my\_shmem}.
}
\end{itemize}
}
\item Slightly over 300 LoC.
\item {
Problem: flawed premise for testing cache write-back!
\begin{itemize}
\item {
Summary: the CPU datapath differs from the DMA datapath, and
common cache coherency maintenance operations are already
performed in the common file/virtual memory area operation code.
}
\item {
Idea: perform cache write-back on \texttt{vm\_ops->close}.
}
\item {
Reality: the virtual memory area has already been cleaned from
cache and removed from the address space by the time
\texttt{vm\_ops->close} is called.
}
\item {
Fix: Implement a custom \texttt{ioctl}?
}
\end{itemize}
}
\end{itemize}
\end{frame}
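If the custom \texttt{ioctl} route were taken, one piece it would likely need is a check that a requested write-back range lies inside the shared region, translated into the page indices to clean. The sketch below is purely hypothetical: the request layout \texttt{shmem\_flush\_req}, the helper \texttt{flush\_req\_to\_pages}, and the 4 KiB page size are assumptions made for illustration, not part of the module. It also assumes the region size in bytes fits in 64 bits.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; a real module would use PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12u

/* Hypothetical ioctl argument: byte range within the shared region. */
struct shmem_flush_req {
	uint64_t offset; /* start of the range, in bytes */
	uint64_t length; /* size of the range, in bytes  */
};

/*
 * Validate the request against a region of nr_pages pages and compute the
 * inclusive range of page indices [*first, *last] to write back.
 * Returns 0 on success, -EINVAL for empty or out-of-bounds requests.
 */
static int flush_req_to_pages(const struct shmem_flush_req *req,
			      uint64_t nr_pages,
			      uint64_t *first, uint64_t *last)
{
	uint64_t region = nr_pages << SKETCH_PAGE_SHIFT;

	if (req->length == 0 || req->offset >= region ||
	    req->length > region - req->offset)
		return -EINVAL;

	*first = req->offset >> SKETCH_PAGE_SHIFT;
	*last = (req->offset + req->length - 1) >> SKETCH_PAGE_SHIFT;
	return 0;
}
```

The handler would then walk the returned page indices and issue the write-back on the kernel mapping of each allocated page, skipping pages that were never faulted in.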
% Part 4: Future Work
% =============================================================================
\section{4. Future Work}
\begin{frame}
\frametitle{4. Future Work}
\begin{enumerate}
\item {
Incorporate the cache coherence mechanism into the larger project.
}
\item {
Implement the memory model within the larger project. This involves:
\begin{itemize}
\item {
Adjusting the message type and structure specifications
for better interoperation with RDMA.
}
\item {
Implementing the memory model programmatically.
}
\end{itemize}
}
\end{enumerate}
\end{frame}
% References
\begin{frame}
\frametitle{References}
\printbibliography
\end{frame}
\end{document}