The x86 architecture uses three types of addresses:
- Logical Address:
- The address of an instruction or a data item as it appears in machine language.
- It consists of a segment and an offset, i.e. the distance from the start of the segment.
- Linear Address or Virtual Address:
- A single binary number in the virtual address space. It lets a process use a location in main memory independently of other processes, and lets it use more space than physically exists in primary storage, because some contents can be temporarily moved out to a hard disk or internal flash drive.
- Physical Address:
- The address of a memory cell in the computer's RAM.
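As a minimal sketch of how a logical address becomes a linear one, the snippet below adds an offset to a segment's base address and checks it against the segment limit. The function name `logical_to_linear` and the example base addresses are illustrative assumptions, not part of any real API.

```python
def logical_to_linear(segment_base: int, offset: int, limit: int = 0xFFFFFFFF) -> int:
    """Return the 32-bit linear address for a (segment base, offset) pair.

    `limit` is the highest valid offset inside the segment (hypothetical default:
    a segment that spans the whole 4 GB space).
    """
    if offset > limit:
        raise ValueError("offset exceeds segment limit")
    return (segment_base + offset) & 0xFFFFFFFF


# Flat segment (base 0): the linear address equals the offset.
flat = logical_to_linear(0x00000000, 0x08048000)

# Hypothetical segment that starts at 0x10000000.
based = logical_to_linear(0x10000000, 0x00001234)

print(hex(flat), hex(based))  # 0x8048000 0x10001234
```

With a base of 0, segmentation has no visible effect, which is the case that matters most for Linux.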
Need for Virtual Addressing
- The main memory (RAM) available to a computer is limited.
- Many processes share common code in libraries.
- With virtual addressing, the CPU and the kernel give each process the impression that it has a large, private address space, independent of how much RAM is actually installed.
- Since two of the three address types above (logical and linear) are not physical, addresses must be translated from logical to linear and from linear to physical.
- For this reason each CPU contains a hardware unit called the Memory Management Unit (MMU), which has two parts:
- Segmentation Unit: converts a logical address to a linear address.
- Paging Unit: converts a linear address to a physical address.
- Translation from a linear address to a physical address uses two translation tables:
- Page Directory
- Page Table
Image Credit: https://manybutfinite.com/post/memory-translation-and-segmentation/
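The two-table lookup can be sketched as a small simulation. This is a simplified model under stated assumptions: the Page Directory and Page Tables are plain Python dictionaries, entries hold only a frame number (no present, permission, or dirty bits), and the name `translate` and the sample mapping are hypothetical.

```python
PAGE_SHIFT = 12  # 4 KB pages


def translate(linear: int, page_directory: dict) -> int:
    """Walk a two-level structure: Page Directory -> Page Table -> page frame."""
    dir_index = (linear >> 22) & 0x3FF    # top 10 bits select the Page Table
    table_index = (linear >> 12) & 0x3FF  # next 10 bits select the page
    offset = linear & 0xFFF               # low 12 bits are kept unchanged

    page_table = page_directory.get(dir_index)
    if page_table is None:
        raise LookupError("page fault: no page table for this directory entry")

    frame = page_table.get(table_index)
    if frame is None:
        raise LookupError("page fault: page not present")

    return (frame << PAGE_SHIFT) | offset


# Hypothetical mapping: directory entry 1, table entry 2 -> frame 0x12345.
page_directory = {1: {2: 0x12345}}

linear = (1 << 22) | (2 << 12) | 0xABC          # 0x00402ABC
print(hex(translate(linear, page_directory)))   # 0x12345abc
```

Only the upper 20 bits are translated; the offset inside the page is copied straight into the physical address.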
Address Translation in Intel x86
Linear-Address Translation to a 4-KByte Page using IA-32e Paging
Formats of CR3 and Paging-Structure Entries with PAE Paging
Image Credit: By Mdjango, Andrew S. Tanenbaum (Own work) [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
Segmentation and Paging in Linux
- Segmentation and paging are largely redundant, so Linux uses segmentation only in a limited way.
- Segmentation can assign a different linear address space to each process.
- Paging can map the same linear address space onto different physical address spaces.
Segments and their usage
- Each segment is described by an 8-byte Segment Descriptor.
- These descriptors are stored in the Global Descriptor Table (GDT) or in a Local Descriptor Table (LDT).
Image credit: By John Källén (jkl at commons) (Own work) [Public domain], via Wikimedia Commons
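To make the 8-byte layout concrete, here is a hedged sketch that unpacks the base, limit, and a few flag bits from a descriptor given as a 64-bit integer. The field positions follow the standard x86 descriptor layout; the function name `decode_descriptor`, the returned dictionary keys, and the two sample values (flat 4 GB code segments at privilege levels 0 and 3) are chosen for illustration only.

```python
def decode_descriptor(raw: int) -> dict:
    """Decode selected fields of an 8-byte x86 segment descriptor."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)        # 20-bit limit
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)  # 32-bit base
    access = (raw >> 40) & 0xFF                                  # access byte
    flags = (raw >> 52) & 0xF                                    # G, D/B, L, AVL

    granular = bool(flags & 0x8)  # G bit: limit counted in 4 KB units
    size_bytes = (limit + 1) * 4096 if granular else limit + 1

    return {
        "base": base,
        "limit": limit,
        "size_bytes": size_bytes,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,           # descriptor privilege level
        "code_or_data": bool(access & 0x10),  # S bit: 0 means a system segment
        "executable": bool(access & 0x08),
    }


# Illustrative flat code-segment descriptors (assumed example values).
kernel_code = decode_descriptor(0x00CF9A000000FFFF)
user_code = decode_descriptor(0x00CFFA000000FFFF)

print(hex(kernel_code["base"]), kernel_code["size_bytes"] == 1 << 32)  # 0x0 True
print(kernel_code["dpl"], user_code["dpl"])                            # 0 3
```

Both sample descriptors cover the same 4 GB range starting at 0 and differ only in privilege level.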
Types of Segment Descriptors
- Code Segment Descriptor
- Data Segment Descriptor
- Task State Segment Descriptor
Segmentation in Linux
Image Credit: https://manybutfinite.com/post/anatomy-of-a-program-in-memory/
Data structures for segmentation
- Global Descriptor Table (GDT)
- There is one GDT per CPU in Linux.
- Each GDT contains 18 descriptors, used for the following purposes:
- Kernel code and user code segments
- Task State Segment (TSS)
- Default Local Descriptor Table (LDT)
- Thread-Local Storage (TLS) segments
- Advanced Power Management (APM)
- Plug and Play (PnP) BIOS services
- A special segment used for handling exceptions
Image Credit: Lars H. Rohwedder (User:RokerHRO – selfmade work
- The paging unit translates linear addresses into physical ones.
- For efficiency, linear addresses are grouped into fixed-length intervals called pages. The contiguous linear addresses within a page are mapped onto contiguous physical addresses.
- Page frames: main memory is divided into fixed-length page frames. Each page frame holds one page. A page is just a block of data, which may reside in memory or on disk.
- Page tables: the data structures that map linear addresses to physical addresses.
- Page size: 4 KB.
- Huge pages: 2 MB and 1 GB.
- Linear address (32 bits) = Directory (10 bits) + Page Table (10 bits) + Offset (12 bits)
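The 10 + 10 + 12 bit layout can be checked with a few shifts and masks. The sketch below is illustrative; `split_linear_address` is a hypothetical helper name and the sample address is arbitrary.

```python
def split_linear_address(linear: int) -> tuple:
    """Split a 32-bit linear address into (directory, table, offset) fields."""
    directory = (linear >> 22) & 0x3FF  # bits 31..22
    table = (linear >> 12) & 0x3FF      # bits 21..12
    offset = linear & 0xFFF             # bits 11..0
    return directory, table, offset


directory, table, offset = split_linear_address(0xC0101234)
print(directory, table, hex(offset))  # 768 257 0x234

# 1024 directory entries x 1024 table entries x 4096-byte pages = 4 GB.
print(1024 * 1024 * 4096 == 1 << 32)  # True
```

The 12-bit offset matches the 4 KB page size, and each 10-bit field indexes a table of 1024 entries.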
Physical Address Extension (PAE)
- To use more than 4 GB of memory, Intel increased the number of address pins on its processors from 32 to 36, which raises the addressable physical memory to 2^36 bytes = 64 GB. Linear addresses remain 32 bits wide.
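The effect of the extra address lines is simple arithmetic, shown here as a short sketch (the helper name `addressable_gib` is an assumption made for this example).

```python
def addressable_gib(address_bits: int) -> int:
    """Return how many GiB can be addressed with the given number of address bits."""
    return (1 << address_bits) // (1 << 30)


print(addressable_gib(32))  # 4
print(addressable_gib(36))  # 64
```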
Caching in Hardware
- Modern microprocessors typically have several levels of cache, commonly three (L1, L2 and L3).
- Caches rely on the principles of spatial locality and temporal locality.
- A cache is divided into cache lines, generally 64 bytes each.
- Most caches are N-way set associative.
- The cache unit sits between the paging unit and main memory.
- It includes both a hardware cache memory and a cache controller.
- Each cache line has a tag and some flags that store the status of the line.
- The CPU looks for an address in the cache before going to main memory.
- Writes are usually handled with a write-back policy, which is more efficient than write-through: the CPU updates only the cache entry, and main memory is updated later, for example when the line is flushed or evicted.
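To illustrate how a set-associative cache uses the tag and the 64-byte line size, the sketch below splits an address into tag, set index, and byte offset. The cache geometry (32 KB, 8-way) is an assumption picked for the example, not a property of any specific CPU, and the helper names are hypothetical.

```python
LINE_SIZE = 64            # bytes per cache line
WAYS = 8                  # assumed associativity
CACHE_SIZE = 32 * 1024    # assumed total size in bytes

NUM_SETS = CACHE_SIZE // (LINE_SIZE * WAYS)  # 64 sets
OFFSET_BITS = LINE_SIZE.bit_length() - 1     # 6 bits select a byte in the line
INDEX_BITS = NUM_SETS.bit_length() - 1       # 6 bits select the set


def split_cache_address(addr: int) -> tuple:
    """Return (tag, set_index, offset) for an address in this cache geometry."""
    offset = addr & (LINE_SIZE - 1)
    set_index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, set_index, offset


def same_line(a: int, b: int) -> bool:
    """Two addresses share a cache line if they differ only in the offset bits."""
    return (a >> OFFSET_BITS) == (b >> OFFSET_BITS)


tag, set_index, offset = split_cache_address(0x12345678)
print(hex(tag), set_index, offset)         # 0x12345 25 56

# Spatial locality: neighbouring bytes fall in the same 64-byte line.
print(same_line(0x12345640, 0x1234567F))   # True
print(same_line(0x1234567F, 0x12345680))   # False
```

The set index picks one of the 64 sets, and the tag is compared against the tags of the 8 lines in that set to decide whether the access is a hit.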
Translation Lookaside Buffers (TLB)
- The TLB is a cache that stores recently used linear-to-physical address translations.
- The address translation unit first looks in the TLB for the physical address that corresponds to a given linear address. If it is not found, the hardware walks the page tables to find the page.
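The lookup order described above (TLB first, page tables on a miss) can be modelled in a few lines. This is a toy model under loud assumptions: the page table is a flat dictionary from virtual page number to frame number, the TLB holds four entries, and replacement is least-recently-used. Real hardware differs in size, structure, and replacement policy. The class name `TinyTLB` is hypothetical.

```python
from collections import OrderedDict


class TinyTLB:
    """Toy TLB in front of a flat page table (virtual page number -> frame)."""

    def __init__(self, page_table: dict, capacity: int = 4):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()  # insertion order doubles as LRU order
        self.hits = 0
        self.misses = 0

    def translate(self, linear: int) -> int:
        vpn, offset = linear >> 12, linear & 0xFFF
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)             # mark as most recently used
        else:
            self.misses += 1
            self.entries[vpn] = self.page_table[vpn]  # "page walk"; KeyError if unmapped
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)      # evict least recently used
        return (self.entries[vpn] << 12) | offset


# Hypothetical mappings: page 0x10 -> frame 0x80, page 0x11 -> frame 0x81.
tlb = TinyTLB({0x10: 0x80, 0x11: 0x81})

print(hex(tlb.translate(0x10004)))  # 0x80004 (miss, filled from the page table)
print(hex(tlb.translate(0x10FFC)))  # 0x80ffc (hit, same page)
print(tlb.hits, tlb.misses)         # 1 1
```

The second access hits because it falls in the same page as the first, so no page-table walk is needed.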
Process page tables
- Linux organizes page tables in four levels:
- Page Global Directory
- Page Upper Directory
- Page Middle Directory
- Page Table
- Each process has its own Page Global Directory and its own set of Page Tables.
- On 32-bit x86 with the default 3 GB/1 GB split, linear addresses from 0x00000000 to 0xbfffffff can be addressed when the process runs in either User Mode or Kernel Mode.
- Linear addresses from 0xc0000000 to 0xffffffff can be addressed only when the process runs in Kernel Mode.
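The split between the two ranges can be expressed as a simple check. This is a sketch that assumes a 32-bit address space with the boundary at 0xc0000000; the constant is named `PAGE_OFFSET` after the kernel macro that marks this boundary, while the function `is_addressable` is a hypothetical helper.

```python
PAGE_OFFSET = 0xC0000000  # assumed boundary: start of the kernel-only range


def is_addressable(linear: int, kernel_mode: bool) -> bool:
    """Return True if the linear address may be used in the given CPU mode."""
    if not 0 <= linear <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit linear address")
    return kernel_mode or linear < PAGE_OFFSET


print(is_addressable(0xBFFFFFFF, kernel_mode=False))  # True
print(is_addressable(0xC0000000, kernel_mode=False))  # False
print(is_addressable(0xC0000000, kernel_mode=True))   # True
```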