X2000: Difference between revisions

Latest revision as of 10:40, 7 October 2024

Use device registers to control I/O pins and light LEDs.

Files

x08-heart:
`heart.c`	Multiplexed heart animation
`hardware.h`	Header file with layout of I/O registers
`startup.c`	Startup code
`device.ld`	Linker script
`Makefile`	Build script
`x08.geany`	Geany project file

Demonstration

Compile and upload the program in directory x08-heart. The program uses the General Purpose Input/Output (GPIO) pins of the micro:bit to control the 25 on-board LEDs in a flashing heart pattern.

Activity

The program contains a subroutine show(image, time) that displays an image for a certain time, expressed in units of 15 milliseconds. After some initialisation, the main program init() contains a loop that shows the same sequence of images over and over again.

/* init -- main program */
void init(void)
{
    ...

    while (1) {
        show(heart, 70);
        show(small, 10);
        show(heart, 10);
        show(small, 10);
    }
}

The two images, heart and small, are specified in a way that will be explained shortly, but as an initial experiment, you could try modifying this code that uses them, without changing the images themselves.

The times, 70+10+10+10, add up to 100 units, so ought to take 1.5 seconds for each repetition of the pattern. Is this time accurate?
Try modifying the times so that the flashing pattern goes at twice the speed.
Change the sequence of calls so that the heart pulses three times in each repetition of the pattern rather than just twice.
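
As a quick check on the arithmetic in the first question, the sketch below adds up the durations for one pass of the loop. The names TICK_MS and pattern_ms are made up for illustration and do not appear in heart.c.

```c
#include <assert.h>

/* Each unit passed to show() is taken to last 15 ms. */
#define TICK_MS 15

/* pattern_ms -- total time for one repetition, given the show() counts */
unsigned pattern_ms(const unsigned counts[], int n)
{
    unsigned total = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        total += counts[i] * TICK_MS;
    return total;
}
```

Calling pattern_ms with the counts 70, 10, 10, 10 gives the nominal figure; whether the board actually achieves it is what the question asks you to measure.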

Background

This experiment moves beyond the core of the processor (designed by ARM) to look at I/O devices (designed by Nordic Semiconductor and documented in their separate datasheet). The simplest I/O device is called GPIO (General Purpose Input/Output) and allows 32 pins of the chip to be individually configured either as outputs, producing a voltage under control of the program, or as inputs, so that the program can sense whether an externally applied voltage on the pin is high or low. On the micro:bit, we can put the GPIO pins to immediate use because some of them are connected to 25 LEDs and two push-buttons.

Basics of LEDs

LEDs, like most semiconductor diodes, conduct in only one direction, shown by the arrow on the circuit symbol. Thus, the circuit on the left lights the LED because current can flow through the LED from the positive to the negative terminal of the battery, in the direction shown by the arrow that forms part of the diode symbol. On the other hand, in the circuit on the right, no current will flow, and the LED will produce no light. It is necessary to connect a resistor in series with the LED in order to control the current. For a 3V supply, about 2V will be dropped across the LED, the remaining 1V will be dropped across the resistor, and a resistor of 220Ω will give a sensible current of 4.5mA through resistor and LED. It doesn't matter which side of the LED the resistor is connected, so long as it is there.
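
The figures quoted above follow from Ohm's law applied to the resistor, taking the 2V forward drop of the LED as an approximation:

```latex
I = \frac{V_{\text{supply}} - V_{\text{LED}}}{R}
  = \frac{3\,\text{V} - 2\,\text{V}}{220\,\Omega}
  \approx 4.5\,\text{mA}
```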

A matrix of LEDs

A single GPIO pin can control an LED connected (with a series resistor) between the pin and ground: this is useful as a debugging aid in many prototypes. Alternatively, an LED and resistor could be connected between two GPIO pins A and B, and would light only if A is high and B is low. This is a useless idea, until we see that multiple LEDs can be connected in this way, as in the picture at the right. With these connections, the fourth LED from the left will light if B is high and Y is low; we can prevent the other LEDs from lighting by setting A low and X and Z high. In fact we can light any single LED by setting the pins appropriately, and we can light one group of three LEDs or the other in any pattern by setting either A or B high, and choosing to set X, Y and Z low in a pattern that corresponds to the LEDs we want to light. The series resistors at X, Y and Z control the current through individual LEDs, so that each lit LED receives the same current whether the others are lit or not. To show a pattern on all six LEDs, we will have to show the two groups alternately, changing between them quickly enough that the flickering is invisible. This "matrix" of six LEDs is the smallest that gives any saving over wiring the LEDs to individual I/O pins, since it uses five pins rather than six.
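
The rule for the six-LED matrix can be written down as a small function. This is only a sketch: the packing of the five pins into one word (PIN_A and so on) and the name frame are invented for the example and correspond to nothing on the micro:bit.

```c
#include <assert.h>

/* Invented bit assignment: A and B select a group of three LEDs
   (active high); X, Y and Z are the pins with series resistors
   (active low). */
enum {
    PIN_A = 1 << 4, PIN_B = 1 << 3,
    PIN_X = 1 << 2, PIN_Y = 1 << 1, PIN_Z = 1 << 0
};

/* frame -- pin states for one group; lit is a mask of PIN_X/Y/Z */
unsigned frame(unsigned group_pin, unsigned lit)
{
    unsigned sinks = PIN_X | PIN_Y | PIN_Z;
    assert(group_pin == PIN_A || group_pin == PIN_B);
    assert((lit & ~sinks) == 0);
    /* Drive the chosen group high, and hold high every sink pin
       whose LED should stay dark; the other group pin stays low. */
    return group_pin | (sinks & ~lit);
}
```

For example, frame(PIN_B, PIN_Y) sets B high and Y low, leaving A low and X and Z high. Showing a full six-LED pattern would mean alternating a frame for A with a frame for B.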

On the V1 micro:bit, though they are arranged physically in a 5x5 array, the LEDs are wired as three 'rows' of nine LEDs each (with two missing), with a documented correspondence to the physical layout.

micro:bit LEDs and buttons

Bits 4–12 of the GPIO register are used for the column lines, and bits 13–15 for the row lines, so the layout of the GPIO register looks like this:

0000|0000|0000|0000|R3 R2 R1 C9|C8 C7 C6 C5|C4 C3 C2 C1|0000

We will use this layout later to work out what patterns of bits to use for various images.
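
One way to capture this layout in C is with a pair of macros that turn a row or column number into a bit mask. The names COL_BIT, ROW_BIT, ALL_COLS, ALL_ROWS and led_pins are illustrative assumptions and are not taken from hardware.h.

```c
#include <assert.h>

/* Bit positions from the layout above: columns C1..C9 occupy
   bits 4..12 and rows R1..R3 occupy bits 13..15. */
#define COL_BIT(c)  (1u << ((c) + 3))    /* c = 1..9 */
#define ROW_BIT(r)  (1u << ((r) + 12))   /* r = 1..3 */

#define ALL_COLS    0x1ff0u              /* bits 4..12 */
#define ALL_ROWS    0xe000u              /* bits 13..15 */

/* led_pins -- mask of every GPIO bit connected to the LED matrix */
unsigned led_pins(void)
{
    unsigned mask = 0;
    for (int c = 1; c <= 9; c++) mask |= COL_BIT(c);
    for (int r = 1; r <= 3; r++) mask |= ROW_BIT(r);
    assert(mask == (ALL_COLS | ALL_ROWS));
    return mask;
}
```

The combined mask comes to 0xfff0, covering the twelve pins that drive the display.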

Context

Ad-hoc matrices of LEDs like the one on the micro:bit seem a bit amateurish (though fun). But the idea of LEDs addressed as a matrix is very common, because the 7-segment LED displays seen everywhere are usually built in this way. Each individual digit has seven (or eight with the decimal point) individual anodes and a common cathode. In a multi-digit display, we can connect together corresponding anodes from each digit, and connect these through series resistors to eight GPIO pins. The common cathode for each display also gets its own GPIO pin, so the total number of pins needed is 8 plus the number of digits. In simple designs, the display can be multiplexed (as on the micro:bit) by bit-banging in software. But there are also special-purpose LED driver chips like the HT16K33 that do the same job in hardware, reducing the load on the processor.

Device registers

On the ARM, I/O devices appear as special storage locations (called device registers) in the same address space as RAM and Flash. Loading from or storing to one of these locations gives information about the external world, or has a side-effect that is externally visible. For example, there are two storage locations, one at address 0x50000514 that we shall call GPIO.DIR, and another at address 0x50000504 that we shall call GPIO.OUT. Storing a bit pattern in GPIO.DIR lets us configure the individual pins as inputs or outputs, and storing to GPIO.OUT sets the voltage on each output pin to high (if 1) or low (if 0). We can set these locations like any other, using an str instruction with a specific address; so to set the GPIO.DIR register to the constant 0xfff0 (thereby configuring 12 of the GPIO pins as outputs), we can use the code

ldr r0, =0x50000514
ldr r1, =0xfff0
str r1, [r0]

In a C program, things are a bit easier. There's a header file hardware.h that contains definitions of all the device registers we shall use in the course; including it allows us to use GPIO.DIR in a program to denote the device register with address 0x50000514, and we can store the constant 0xfff0 in that register just by writing

GPIO.DIR = 0xfff0;

The compiler translates this into exactly the code shown earlier.

The GPIO.DIR register has a row of latches that remember whether each GPIO pin is configured as an input (0) or an output (1), and the GPIO.OUT register has a row of latches that remember the output voltage that has been selected for each pin, either low (0) or high (1). Internally, the same electronic signals that for an ordinary RAM location would cause the RAM to update the contents of a word in this case set the latches and therefore the output directions or values. (On other architectures such as the x86, I/O devices usually do not appear as storage locations accessed with the usual load and store instructions, but there are separate in and out instructions, and effectively a separate address space.) To use the LEDs, we first need to configure the GPIO port so that the relevant bits are outputs: this is achieved with the assignment GPIO.DIR = 0xfff0.

To light a single LED, we should connect its row to +3.3V and its column to 0V. Then current can flow from the row pin, through the LED and its series resistor, to the column pin, and the LED will light up. The series resistor limits the current that can flow through the LED to a value that is safe both for the LED and the microcontroller pins that are connected to it. If we set the other row lines, apart from the one belonging to our chosen LED, to 0V and the other column lines to +3.3V, then other LEDs in the same row or column as the lit LED will have their anode and cathode at the same potential, and will carry no current. LEDs that are not in the same row or column will have their anode at 0V and their cathode at +3.3V, so they will be reverse biased and also will carry no current, and only the one chosen LED will light. So to light the middle LED in the 5x5 array, which is electrically in row 2 and column 3, we set GPIO.OUT like this:

R3 R2 R1 C9 |C8 C7 C6 C5 |C4 C3 C2 C1 | 0 0 0 0
 0  1  0  1 | 1  1  1  1 | 1  0  1  1 | 0 0 0 0
      5            f            b           0

The needed assignment is GPIO.OUT = 0x5fb0.
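
The rule just described (an LED carries current only when its row is high and its column is low) can be checked on a desktop machine against any candidate value for GPIO.OUT. The functions is_lit and count_lit below are hypothetical helpers written for this check, not part of the lab code.

```c
#include <assert.h>
#include <stdbool.h>

/* is_lit -- does the LED at electrical row r, column c carry current
   when GPIO.OUT holds the value out? */
bool is_lit(unsigned out, int r, int c)
{
    assert(r >= 1 && r <= 3 && c >= 1 && c <= 9);
    bool row_high = (out >> (r + 12)) & 1;    /* anode at +3.3V */
    bool col_low = !((out >> (c + 3)) & 1);   /* cathode at 0V */
    return row_high && col_low;
}

/* count_lit -- number of row/column positions lit by the value out
   (this counts the two positions where no LED is fitted as well) */
int count_lit(unsigned out)
{
    int n = 0;
    for (int r = 1; r <= 3; r++)
        for (int c = 1; c <= 9; c++)
            if (is_lit(out, r, c)) n++;
    return n;
}
```

With out = 0x5fb0, is_lit is true only for row 2, column 3, so count_lit reports a single lit position.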

micro:bit version 2

The V2 micro:bit has the LEDs and the GPIO registers arranged in a different way. Instead of being wired in three rows of nine, the LEDs are wired in a more logical way as five rows of five, using ten GPIO pins in all. The V1 microcontroller has 32 GPIO pins arranged as a single 32-bit word, but the microcontroller on the V2 board has more, and in place of the single register GPIO.OUT there are two registers GPIO0.OUT and GPIO1.OUT (and similarly two direction registers GPIO0.DIR and GPIO1.DIR). Irritatingly, the ten pins needed for the LEDs are not arranged contiguously, but spread across both GPIO registers, so that lighting the central LED requires the two assignments,

GPIO0.OUT = 0x50008800;
GPIO1.OUT = 0x00000020;

Further details are given below.

Multiplexing

We could light all 25 LEDs at once by setting all the row lines high and all the column lines low, giving the value 0xe000. Each individual LED would be a bit dimmer, because the current available in each column (set by the 220Ω series resistor) would be shared among three LEDs. For most patterns, however, we will need to multiplex the three rows, lighting the correct LEDs in each row in turn, and pausing a bit before moving on to the next row. If the pauses are short enough, persistence of vision will make it seem that all the LEDs making up the pattern are lit together. For example, a heart pattern

. X . X .
X X X X X
X X X X X
. X X X .
. . X . .

is made (according to the map shown above) by lighting LEDs 5, 6, 7, 9 in row 1, LEDs 1, 2, 3, 4, 5 in row 2, and LEDs 1, 4, 5, 6, 7, 8, 9 in row 3, so we want to use the bit patterns

0010 1000 1111 0000 = 0x28f0
0101 1110 0000 0000 = 0x5e00
1000 0000 0110 0000 = 0x8060

in succession. In each case, one of the row bits is 1, and some of the column bits are 0 according to which LEDs we want to light in that row. Suitable code might be

while (1) {
    GPIO.OUT = 0x28f0;
    delay(JIFFY);
    GPIO.OUT = 0x5e00;
    delay(JIFFY);
    GPIO.OUT = 0x8060;
    delay(JIFFY);
}

The delay() subroutine accepts a time in microseconds, so a suitable value of the constant JIFFY would be 5000, for a delay of 5 milliseconds. Then an iteration of the loop will take just about 15 milliseconds, or about 66 frames per second, fast enough that no flickering will be visible.
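
The three constants in the loop can be derived mechanically from the lists of LEDs to light in each electrical row. The following is a minimal sketch of that calculation; the function row_pattern is an invented helper, not something provided by hardware.h.

```c
#include <assert.h>

/* row_pattern -- GPIO.OUT value for electrical row r (1..3), lighting
   the n columns listed in cols[] (each in the range 1..9) */
unsigned row_pattern(int r, const int cols[], int n)
{
    unsigned lit = 0;
    assert(r >= 1 && r <= 3);
    for (int i = 0; i < n; i++) {
        assert(cols[i] >= 1 && cols[i] <= 9);
        lit |= 1u << (cols[i] + 3);
    }
    /* Row bit high; column bits high except for the lit LEDs. */
    return (1u << (r + 12)) | (0x1ff0u & ~lit);
}
```

Given the column lists 5, 6, 7, 9 for row 1, then 1 to 5 for row 2, and 1, 4, 5, 6, 7, 8, 9 for row 3, this reproduces 0x28f0, 0x5e00 and 0x8060.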

If we want to display more than one image from the program, a better plan is to represent each image by an array, and write a subroutine that can display image data from an array for a specified time. We can define a fixed array heart like this, containing the three rows of image data given above.

const unsigned heart[] = {
    0x28f0, 0x5e00, 0x8060
};

Using the keyword const makes heart into a constant array, which the C compiler will arrange to store in ROM space.

Here is the subroutine show() that takes such an array and an integer count as parameters, and displays the three rows of the image successively, repeating the specified number of times.

/* show -- display three rows of a picture n times */
void show(const unsigned img[], int n)
{
    while (n-- > 0) {
        /* Takes 15msec per iteration */
        for (int p = 0; p < 3; p++) {
            GPIO.OUT = img[p];
            delay(JIFFY);
        }
    }
}

The subroutine receives as img a pointer to the array of image data. The body of the subroutine contains a nested loop, with the outer loop repeating n times, and the inner loop showing the three rows of the image in turn, pausing briefly between each row and the next, taking a total of 15n milliseconds.

Calculating images manually like we did earlier is worth doing once or twice to see how it works, but obviously the process is one that could be automated. The header file hardware.h in fact defines a macro IMAGE that can be used like this to define any fixed image.

const unsigned square[] =
    IMAGE(1,1,1,1,1,
          1,0,0,0,1,
          1,0,0,0,1,
          1,0,0,0,1,
          1,1,1,1,1);
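
To give an idea of the calculation that such a macro has to perform, here is a sketch that does the same job at run time with an ordinary function. It is not the definition from hardware.h. The table led_map is an assumption: it is the commonly published V1 correspondence between physical and electrical positions, which agrees with the heart data above but is not spelled out on this page, and the name make_image is invented.

```c
#include <assert.h>

/* Electrical (row, column) of each LED in physical reading order.
   ASSUMPTION: commonly published V1 mapping, not given in the text. */
static const struct { int row, col; } led_map[25] = {
    {1,1}, {2,4}, {1,2}, {2,5}, {1,3},
    {3,4}, {3,5}, {3,6}, {3,7}, {3,8},
    {2,2}, {1,9}, {2,3}, {3,9}, {2,1},
    {1,8}, {1,7}, {1,6}, {1,5}, {1,4},
    {3,3}, {2,7}, {3,1}, {2,6}, {3,2}
};

/* make_image -- fill img[0..2] with GPIO.OUT values for 25 pixels */
void make_image(const int pixels[25], unsigned img[3])
{
    assert(pixels != 0 && img != 0);
    /* Start with each row bit high and all nine columns high (dark). */
    for (int r = 0; r < 3; r++)
        img[r] = (1u << (r + 13)) | 0x1ff0u;
    /* Pull the column bit low for every lit pixel. */
    for (int i = 0; i < 25; i++)
        if (pixels[i])
            img[led_map[i].row - 1] &= ~(1u << (led_map[i].col + 3));
}
```

Feeding in the 25 pixels of the heart pattern yields the same three words as the hand calculation. A macro has the advantage that the work is done at compile time, so the result can live in a const array.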

micro:bit version 2

The different wiring of the V2 board means that two words of data are needed for each of five rows of LEDs, so we can make an array of ten 32-bit integers to represent the whole image. While the LEDs have a more logical wiring layout, the assignment of GPIO pins to the signals is more haphazard, so calculating the hexadecimal constants needed for an image is no less tedious than for the V1 board. This definition of heart has comments showing the layout of the row and column pins, with two hexadecimal constants following each pair of rows. As before, each frame of the display (now two integers) has a 1 bit for exactly one row pin, and 1 bits for those column pins that correspond to unlit LEDs.

const unsigned heart[] = {
/*     31       27      23      19      15      11      7       3     0
GPIO0: C3 C5.C1|. . .R4|.R2 R1.|R5. . .|R3. . .|C2. . .|. . . .|. . . .
GPIO1:  . . . .|. . . .|. . . .|. . . .|. . . .|. . . .|. .C4 .|. . . . */

     /* 1 1 0 1|0 0 0 0|0 0 1 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0
        0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0 */
        0xd0200000, 0x00000000,
     /* 0 0 0 0|0 0 0 0|0 1 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0
        0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0 */
        0x00400000, 0x00000000,
     /* 0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|1 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0
        0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0 */
        0x00008000, 0x00000000,
     /* 0 1 0 1|0 0 0 1|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0
        0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0 */
        0x51000000, 0x00000000,
     /* 0 1 0 1|0 0 0 0|0 0 0 0|1 0 0 0|0 0 0 0|1 0 0 0|0 0 0 0|0 0 0 0
        0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 1 0|0 0 0 0 */
        0x50080800, 0x00000020
};

Thankfully, once we have seen that this calculation can be done, we can forever after rely on a suitable definition of the IMAGE(...) macro to compute all the images we need. In the version of hardware.h intended for the V2 board, the macro is defined differently, so that it produces the correct array of ten integers for use with show() and similar functions.
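
For the V2 board the same calculation can be sketched as a function that builds the ten words from 25 pixels. The pin positions are read off the layout comment in the definition of heart above; the names row_bit, col_bit, col_port and make_image_v2 are invented for this example, and the real macro in hardware.h may work quite differently.

```c
#include <assert.h>

/* Pin positions read off the layout comment in the heart[] definition. */
static const int row_bit[5]  = { 21, 22, 15, 24, 19 };  /* R1..R5, GPIO0 */
static const int col_bit[5]  = { 28, 11, 31, 5, 30 };   /* C1..C5 */
static const int col_port[5] = { 0, 0, 0, 1, 0 };       /* C4 is on GPIO1 */

/* make_image_v2 -- fill img[0..9] from 25 pixels in reading order */
void make_image_v2(const int pixels[25], unsigned img[10])
{
    assert(pixels != 0 && img != 0);
    for (int r = 0; r < 5; r++) {
        unsigned word[2] = { 1u << row_bit[r], 0 };   /* row pin high */
        for (int c = 0; c < 5; c++)
            if (!pixels[5*r + c])                     /* dark: column high */
                word[col_port[c]] |= 1u << col_bit[c];
        img[2*r] = word[0];       /* value for GPIO0.OUT */
        img[2*r + 1] = word[1];   /* value for GPIO1.OUT */
    }
}
```

Applied to the heart pattern, this gives the same ten constants as the table above.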

micro:bit version 2

The show() function for V2 is similar to the function for V1, but it deals in five rows with two GPIO registers to be set for each row.

/* show -- display five rows of a picture n times */
void show(const unsigned img[], int n)
{
    while (n-- > 0) {
        /* Takes 15msec per iteration */
        for (int p = 0; p < 10; p += 2) {
            GPIO0.OUT = img[p];
            GPIO1.OUT = img[p+1];
            delay(JIFFY);
        }
    }
}

To get the same total time of 15 milliseconds per iteration, we can make JIFFY be 3000 instead of 5000.

What about those delays? The easiest way of implementing them is to write a loop that counts down from some constant, creating a predictable delay. With a bit of care, we can write a loop that takes 500 nsec per iteration, so to delay d microseconds we need to make it iterate 2d times.

void delay(unsigned usec)
{
    unsigned n = 2*usec;
    while (n > 0) {
        nop(); nop(); nop();
        n--;
    }
}

Looking at the compiler output (and timing it with an oscilloscope) reveals that the loop takes 5 cycles per iteration without the nop instructions, so adding three of them brings it up to 8 cycles per iteration, or 500 nsec at 16 MHz. Delays were commonly implemented like this in old-fashioned MS-DOS games, and the games became unplayable when they were moved to a machine with a faster clock than the original PC. The same thing happens to us for the V2 micro:bit, where the clock runs at 64 MHz.
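
The effect of the clock speed on a counted loop is simple arithmetic, sketched below. The functions loop_ns and actual_usec are invented for the illustration, and they assume every cycle takes exactly one clock period, which ignores any time spent waiting for instructions to be fetched.

```c
#include <assert.h>

/* loop_ns -- duration of one loop iteration in nanoseconds, given its
   cycle count and the clock frequency in MHz */
unsigned loop_ns(unsigned cycles, unsigned clock_mhz)
{
    assert(clock_mhz > 0);
    return cycles * 1000 / clock_mhz;
}

/* actual_usec -- time really taken by delay(usec) when the loop makes
   2*usec iterations of the given length */
unsigned actual_usec(unsigned usec, unsigned cycles, unsigned clock_mhz)
{
    return 2 * usec * loop_ns(cycles, clock_mhz) / 1000;
}
```

At 16 MHz an 8-cycle iteration lasts 500 nsec and delay(5000) takes the intended 5000 microseconds; on this idealised model the same loop at 64 MHz would finish in a quarter of the time.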

A more serious problem with delay loops is that they force the machine to do nothing useful during the delay. That is a problem we will solve later – by using a hardware timer in place of the delay loop, then making the program interrupt controlled, and ultimately by introducing an operating system that is able to schedule other processes to run while waiting for the timer to fire.

micro:bit version 2

On the V2, delay loops like this are made more complicated by variations in the time it takes to execute an instruction. The processor clock ticks at 64 MHz, rather than the 16 MHz on the V1, and the flash memory where the program is stored can't feed instructions to the processor core fast enough to keep up with this rate. This means that there are sometimes delays between instructions while the processor core waits to fetch code from flash memory, and that makes it difficult to calculate the exact number of clock cycles that a sequence of instructions will take. The poor flash performance can be hidden to some extent by activating a small, RAM-based cache between the processor core and the flash. Although this speeds things up on average, it also makes the timing still more unpredictable. The best solution to this problem is to avoid delay loops entirely; but in the interim, we can improve matters by arranging to copy the code for timing-critical subroutines like delay into RAM when the program starts. The experimental setup is arranged so that this will happen automatically if the magic word CODERAM appears at the start of the subroutine.

The push-buttons on the micro:bit are connected to other GPIO pins that can be configured as inputs. For various reasons, the buttons are wired with pull-up resistors as shown in the schematic above, so that the input signal to the chip is +3.3V if the button is not pushed (a logical 1), and drops to 0V (a logical 0) when it is pushed. Lab 2 begins with a program (an electronic Valentine's card) that shows a beating heart pattern, and asks you to enhance it so that pressing the buttons changes the pattern shown on the display.
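
Because of the pull-up resistors, the test for a pressed button is inverted: the button is down when its input bit reads 0. A minimal sketch of the test follows; the function button_pressed is hypothetical, its argument in stands for a word read from the GPIO input register, and the pin number is left as a parameter because the button pins are not listed here.

```c
#include <assert.h>
#include <stdbool.h>

/* button_pressed -- test one input bit; with a pull-up resistor the
   pin reads 1 when the button is released and 0 when it is pressed */
bool button_pressed(unsigned in, int pin)
{
    assert(pin >= 0 && pin < 32);
    return ((in >> pin) & 1) == 0;
}
```

The code already present in the lab program for testing the buttons may be organised differently.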

Challenges

See if you can work out how to make (by hand if you have the patience) an image hollow that shows a heart shape as an outline: it is the solid heart shape without the pixels that are lit in the small heart shape. Make a more elaborate animation that uses all three shapes.
The constant JIFFY determines the time for which each row of an image is displayed. Try changing it: how big can it be before the image visibly flickers?
Code is already present to test the state of the two push-buttons. Modify the program so that it shows different animations when one or both of the buttons are pressed.

Questions

What does the assignment statement GPIO.OUT = 0x4000 look like in assembly language?

GPIO.OUT is defined (in a slightly convoluted way) in hardware.h as an unsigned integer variable at the address 0x50000504, a constant obtained from the nRF51822 datasheet. The same assignment can be written directly as

(* (unsigned volatile *) 0x50000504) = 0x4000

The volatile keyword says to the C compiler, "Please just do the assignment now: don't try to be clever and optimise this in any way, such as combining this assignment with a later one that targets the same address."

To achieve this in assembly language, we need an str instruction for the assignment. But both the address being assigned to and the value being assigned are large constants, so we'll need to put the constants in registers first, using the pc-relative load instructions for which "ldr =" is a shorthand. So suitable code is

ldr r0, =0x4000
ldr r1, =0x50000504
str r0, [r1]

Note: that code works fine, and is reasonably efficient. But there's another way of putting the constant 0x4000 in register r0 that takes the same time on the Cortex-M0 and uses marginally less code space:

movs r0, #0x4
lsls r0, r0, #12

That code exploits the fact that the constant has only one non-zero bit, and that (or something very similar) is the code that gcc actually generates in place of the first ldr =.

What does the magic word CODERAM do on V2?

This magic word is defined in hardware.h as an abbreviation for the GCC annotation

__attribute((noinline, section(".xram")))

Functions labelled with this attribute will be collected together into a linker section .xram, different from the section .text that contains the rest of the program. The linker script nRF52833.ld gives instructions to keep this code separate, and the startup code in startup.c copies the code into the right place in RAM. An area of 4kB (out of the 128kB of RAM in the machine) is set aside for this purpose, even in programs that do not use CODERAM; this should be enough for the timing-critical code in any of our programs. The .xram section appears at address 0x81f000 in the address space of the machine, putting it within the range reachable by bl instructions stored in the Flash. In addition, RAM at this address is accessed by a different channel from the same RAM in the space above 0x20000000, speeding up access.
