Writing A Simple Operating System From Scratch

3m ago
7 Views
0 Downloads
756.29 KB
77 Pages
Transcription

iWriting a Simple Operating System —from ScratchbyNick BlundellSchool of Computer Science, University of Birmingham,UKDraft: December 2, 2010Copyright c 2009–2010 Nick Blundell

ContentsContentsii1Introduction12Computer Architecture and the Boot Process2.1 The Boot Process . . . . . . . . . . . . . . . .2.2 BIOS, Boot Blocks, and the Magic Number .2.3 CPU Emulation . . . . . . . . . . . . . . . . .2.3.1Bochs: A x86 CPU Emulator . . . .2.3.2QEmu . . . . . . . . . . . . . . . . .2.4 The Usefulness of Hexadecimal Notation . . .3.3345666Boot Sector Programming (in 16-bit Real Mode)3.1 Boot Sector Re-visited . . . . . . . . . . . . . . . .3.2 16-bit Real Mode . . . . . . . . . . . . . . . . . . .3.3 Erm, Hello? . . . . . . . . . . . . . . . . . . . . . .3.3.1Interrupts . . . . . . . . . . . . . . . . . .3.3.2CPU Registers . . . . . . . . . . . . . . . .3.3.3Putting it all Together . . . . . . . . . . .3.4 Hello, World! . . . . . . . . . . . . . . . . . . . . .3.4.1Memory, Addresses, and Labels . . . . . .3.4.2’X’ Marks the Spot . . . . . . . . . . . . .Question 1 . . . . . . . . . . . . . . . . . .3.4.3Defining Strings . . . . . . . . . . . . . . .3.4.4Using the Stack . . . . . . . . . . . . . . .Question 2 . . . . . . . . . . . . . . . . . .3.4.5Control Structures . . . . . . . . . . . . .Question 3 . . . . . . . . . . . . . . . . . .3.4.6Calling Functions . . . . . . . . . . . . . .3.4.7Include Files . . . . . . . . . . . . . . . . .3.4.8Putting it all Together . . . . . . . . . . .Question 4 . . . . . . . . . . . . . . . . . .3.4.9Summary . . . . . . . . . . . . . . . . . . .8810101111111313131616171717191921212122ii.

CONTENTS3.5.22232323242728.303132353639Writing, Building, and Loading Your Kernel5.1 Understanding C Compilation . . . . . . . . . . . . . . .5.1.1Generating Raw Machine Code . . . . . . . . .5.1.2Local Variables . . . . . . . . . . . . . . . . . .5.1.3Calling Functions . . . . . . . . . . . . . . . . .5.1.4Pointers, Addresses, and Data . . . . . . . . . .5.2 Executing our Kernel Code . . . . . . . . . . . . . . . .5.2.1Writing our Kernel . . . . . . . . . . . . . . . .5.2.2Creating a Boot Sector to Bootstrap our Kernel5.2.3Finding Our Way into the Kernel . . . . . . . .5.3 Automating Builds with Make . . . . . . . . . . . . . . .5.3.1Organising Our Operating System’s Code Base5.4 C Primer . . . . . . . . . . . . . . . . . . . . . . . . . . .5.4.1The Pre-processor and Directives . . . . . . . .5.4.2Function Declarations and Header Files . . . . .414141444647495050535457595960Developing Essential Device Drivers and a Filesystem6.1 Hardware Input/Output . . . . . . . . . . . . . . . . . .6.1.1I/O Buses . . . . . . . . . . . . . . . . . . . . .6.1.2I/O Programming . . . . . . . . . . . . . . . . .6.1.3Direct Memory Access . . . . . . . . . . . . . .6.2 Screen Driver . . . . . . . . . . . . . . . . . . . . . . . .6.2.1Understanding the Display Device . . . . . . . .6.2.2Basic Screen Driver Implementation . . . . . . .6.2.3Scrolling the Screen . . . . . . . . . . . . . . . .6.3 Handling Interrupts . . . . . . . . . . . . . . . . . . . . .6.4 Keyboard Driver . . . . . . . . . . . . . . . . . . . . . .6.5 Hard-disk Driver . . . . . . . . . . . . . . . . . . . . . .6.6 File System . . . . . . . . . . . . . . . . . . . . . . . . .626263636565656569707070703.64567iiiNurse, Fetch me my Steth-o-scope . . .3.5.1Question 5 (Advanced) . . . . .Reading the Disk . . . . . . . . . . . . .3.6.1Extended Memory Access Using3.6.2How Disk Drives Work . . . . .3.6.3Using BIOS to Read the Disk .3.6.4Putting it all Together . . . . .Entering 32-bit Protected Mode4.1 Adapting to Life Without BIOS . . .4.2 Understanding the Global Descriptor4.3 Defining the GDT in Assembly . . .4.4 Making the Switch . . . . . . . . . .4.5 Putting it all Together . . . . . . . . . . . . . . . . . . . . . . .Segments. . . . . . . . . . . . . . . . . . .Table. . . . . . . . . .Implementing Processes717.1 Single Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 717.2 Multi-processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

CONTENTSiv8Summary72Bibliography73

Chapter1IntroductionWe’ve all used an operating system (OS) before (e.g. Windows XP, Linux, etc.), andperhaps we have even written some programs to run on one; but what is an OS actuallythere for? how much of what I see when I use a computer is done by hardware and howmuch is done by software? and how does the computer actually work?The late Prof. Doug Shepherd, a lively teacher of mine at Lancaster University,once reminded me amid my grumbling about some annoying programming problem that,back in the day, before he could even begin to attempt any research, he had to writehis own operating system, from scratch. So it seems that, today, we take a lot forgranted about how these wonderful machines actually work underneith all those layersof software that commonly come bundled with them and which are required for theirday-to-day usefulness.Here, concentrating on the widely used x86 architecture CPU, we will strip bare ourcomputer of all software and follow in Doug’s early footsteps, learning along the wayabout: How a computer boots How to write low-level programs in the barren landscape where no operatingsystem yet exists How to configure the CPU so that we can begin to use its extended functionality How to bootstrap code written in a higher-level language, so that we can reallystart to make some progress towards our own operating system How to create some fundamental operating system services, such as device drivers,file systems, multi-tasking processing.Note that, in terms of practical operating system functionality, this guide does notaim to be extensive, but instead aims to pool together snippets of information frommany sources into a self-contained and coherent document, that will give you a hands-onexperience of low-level programming, how operating systems are written, and the kindof problems they must solve. The approach taken by this guide is unique in that theparticular languages and tools (e.g. assembly, C, Make, etc.) are not the focus butinstead are treated as a means to an end: we will learn what we need to about thesethings to help us achieve our main goal.1

CHAPTER 1. INTRODUCTION2This work is not intended as a replacement but rather as a stepping stone to excellentwork such as the Minix project [?] and to operating system development in general.

Chapter2Computer Architecture and theBoot Process2.1The Boot ProcessNow, we begin our journey.When we reboot our computer, it must start up again, initially without any notion ofan operating system. Somehow, it must load the operating system --- whatever variantthat may be --- from some permanent storage device that is currently attached to thecomputer (e.g. a floppy disk, a hard disk, a USB dongle, etc.).As we will shortly discover, the pre-OS environment of your computer offers little inthe way of rich services: at this stage even a simple file system would be a luxury (e.g.read and write logical files to a disk), but we have none of that. Luckily, what we do haveis the Basic Input/Output Software (BIOS), a collection of software routines that areinitially loaded from a chip into memory and initialised when the computer is switchedon. BIOS provides auto-detection and basic control of your computer’s essential devices,such as the screen, keyboard, and hard disks.After BIOS completes some low-level tests of the hardware, particularly whether ornot the installed memory is working correctly, it must boot the operating system storedon one of your devices. Here, we are reminded, though, that BIOS cannot simply load afile that represents your operating system from a disk, since BIOS has no notion of a filesystem. BIOS must read specific sectors of data (usually 512 bytes in size) from specificphysical locations of the disk devices, such as Cylinder 2, Head 3, Sector 5 (details ofdisk addressing are described later, in Section XXX).So, the easiest place for BIOS to find our OS is in the first sector of one of the disks(i.e. Cylinder 0, Head 0, Sector 0), known as the boot sector. Since some of our disks maynot contain an operating systems (they may simply be connected for additional storage),then it is important that BIOS can determine whether the boot sector of a particulardisk is boot code that is intended for execution or simply data. Note that the CPU doesnot differentiate between code and data: both can be interpreted as CPU instructions,where code is simply instructions that have been crafted by a programmer into someuseful algorithm.3

CHAPTER 2. COMPUTER ARCHITECTURE AND THE BOOTPROCESS4Again, an unsophisticated means is adopted here by BIOS, whereby the last twobytes of an intended boot sector must be set to the magic number 0xaa55. So, BIOSloops through each storage device (e.g. floppy drive, hard disk, CD drive, etc.), readsthe boot sector into memory, and instructs the CPU to begin executing the first bootsector it finds that ends with the magic number.This is where we seize control of the computer.2.2BIOS, Boot Blocks, and the MagicNumberIf we use a binary editor, such as TextPad [?] or GHex [?], that will let us write raw bytevalues to a file --- rather than a standard text editor that will convert characters such as’A’ into ASCII values --- then we can craft ourselves a simple yet valid boot sector.e9 fd ff 00 00 00 00 00 00 00 00 00 00 00 00 0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00*00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aaFigure 2.1: A machine code boot sector, with each byte displayed inhexadecimal.Note that, in Figure 2.1, the three important features are: The initial three bytes, in hexadecimal as 0xe9, 0xfd and 0xff, are actuallymachine code instructions, as defined by the CPU manufacturer, to perform anendless jump. The last two bytes, 0x55 and 0xaa, make up the magic number, which tells BIOSthat this is indeed a boot block and not just data that happens to be on a drive’sboot sector. The file is padded with zeros (’*’ indicates zeros omitted for brevity), basically toposition the magic BIOS number at the end of the 512 byte disk sector.An important note on endianness. You might be wondering why the magic BIOSnumber was earlier described as the 16-bit value 0xaa55 but in our boot sector waswritten as the consecutive bytes 0x55 and 0xaa. This is because the x86 architecturehandles multi-byte values in little-endian format, whereby less significant bytes proceedmore significant bytes, which is contrary to our familiar numbering system --- though ifour system ever switched and I had 0000005 in my bank account, I would be able toretire now, and perhaps donate a couple of quid to the needy Ex-millionaires Foundation.Compilers and assemblers can hide many issues of endianness from us by allowingus to define the types of data, such that, say, a 16-bit value is serialised automaticallyinto machine code with its bytes in the correct order. However, it is sometimes useful,

CHAPTER 2. COMPUTER ARCHITECTURE AND THE BOOTPROCESS5especially when looking for bugs, to know exactly where an individual byte will be storedon a storage device or in memory, so endianness is very important.This is possibly the smallest program your computer could run, but it is a validprogram nonetheless, and we can test this in two ways, the second of which is much saferand better suited to our kind of experiments: Using whatever means your current operating system will allow, write this bootblock to the first sector of a non-essential storage device (e.g. floppy disk or flashdrive), then reboot the computer. Use virtual machine software, such as VMWare or VirtualBox, and set the bootblock code as a disk image of a virtual machine, then start-up the virtual machine.You can be sure this code has been loaded and executed if your computer simplyhangs after booting, without a message such as “No operating system found”. This is theinfinite loop at work, that we put at the start of the code. Without this loop the CPUwould tear off, executing every subsequent instruction in memory, most of which willbe random, uninitialised bytes, until it throws itself into some invalid state and eitherreboots or, by chance, stumbles upon and runs a BIOS routine that formats your maindisk.Remember, it is us that program the computer, and the computer follows our instructions blindly, fetching and executing them, until it is switched off; so we need tomake sure that it executes our crafted code rather than random bytes of data held somewhere in memory. At this low level, we have a lot of power and responsibility over ourcomputer, so we need to learn how to control it.2.3CPU EmulationThere is a third, more convenient option for testing these low-level programs withoutcontinuously having to reboot a machine or risk scrubbing your important data off adisk, and that is to use a CPU emulator such as Bochs or QEmu. Unlike machinevirtualisation (e.g. VMware, VirtualBox), which tries to optimise for performance andtherefore usage of the hosted operating system by running guest instructions directly onthe CPU, emulation involves a program that behaves like a specific CPU architecture,using